[llvm-dev] [RFC] Delaying phi-to-select transformation until later in the pass pipeline
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 17 13:17:15 PDT 2018
On 08/17/2018 09:02 AM, Amara Emerson wrote:
>
>> On Aug 15, 2018, at 10:57 PM, Hal Finkel via llvm-dev
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>>
>> On 08/15/2018 02:38 PM, Philip Reames via llvm-dev wrote:
>>>
>>> I'm concerned that we're focusing on one side of this. Let me point
>>> out a few concerns w/changing the canonical form here:
>>>
>>> 1. LICM does not know how to hoist or sink regions. It does know
>>> how to hoist and sink selects.
>>>
> I think we should teach LICM to do this eventually.
Agreed. I just don't currently see patches to do so. Once LICM
supports region hoisting, much of my concern disappears.
>>>
>>> 2. InstCombine has limited support for triangles/diamonds, but
>>> fairly extensive support for selects.
>>> 3. EarlyCSE and GVN do not know how to eliminate fully redundant
>>> triangles/diamonds. PRE *may* get some of these cases, but that
>>> is by no means guaranteed or likely to be robust.
>>>
> Agreed, we’ll need a plan to deal with these issues.
Again, I'd love to see these issues fixed. Once that's done, the
convert about phi vs select as canonical form is much less relevant.
>>>
>>> My personal opinion is that selects are the appropriate canonical form.
>>>
>>> For the one of the specific cases mentioned, teaching GVN to do FRE
>>> and PRE for loads from selects of pointers just doesn't seem that
>>> painful. I would be really tempted to do that instead. Similarly,
>>> walking facts back from select uses in CVP seems generally useful
>>> since we have use dependent facts in a bunch of contexts, not just
>>> selects. (Call arguments for instance, non-null from unconditional
>>> deref, etc..)
>>>
>>> To be clear, I am raising concerns, not actively objecting to this.
>>> If you want to move forward and commit to work through all of the
>>> issues identified I wont actively stand in the way.
>>>
>>
>> As I've expressed in the past, I think that not using select as part
>> of our canonical form is potentially a superior design. However, it
>> would be a major change. In addition to the issues that Philip
>> mentions, there's also the fact that we'll just have more basic
>> blocks and that, in itself, could lead to an increase in compile
>> time. However, working through these issues will likely leave us with
>> a more-robust optimizer.
>>
>> See also: https://bugs.llvm.org/show_bug.cgi?id=34603#c19
> Canonicalizing to phis is my preference too.
>>
>> -Hal
>>
>>> Philip
>>>
>>>
>>> On 08/14/2018 12:39 PM, Sanjay Patel via llvm-dev wrote:
>>>> I didn't look closely at the new patch, but this appears to be a
>>>> small extension to:
>>>> https://reviews.llvm.org/D38566
>>>> ...and the GVN-based reasons for delaying transformation to
>>>> 'select' are discussed in detail in the motivating bug for that patch:
>>>> https://bugs.llvm.org/show_bug.cgi?id=34603
>>>>
>>>> So this sounds like the right direction to me. Note that there was
>>>> objection to the implementation (a pile of pass options vs.
>>>> uniquely named passes).
>>>>
>>>> Here's another motivating bug where early transform to select
>>>> prevents optimization:
>>>> https://bugs.llvm.org/show_bug.cgi?id=36760
>>>>
>>>> Is that case affected by this patch?
>>>>
>>>>
>>>> On Tue, Aug 14, 2018 at 11:17 AM, John Brawn via llvm-dev
>>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>
>>>> Summary
>>>> =======
>>>>
>>>> I'm planning on adjusting SimplifyCFG so that it doesn't turn
>>>> two-entry phi
>>>> nodes into selects until later in the pass pipeline, to give
>>>> passes which can
>>>> understand phis but not selects more opportunity to optimize.
>>>> The thing I'm
>>>> trying to do which made me think of doing this is described
>>>> below, but from the
>>>> benchmarking I've done it looks like this is overall a good
>>>> idea regardless of
>>>> if I manage to get that done or not.
>>>>
>>>> Motivation
>>>> ==========
>>>>
>>>> My goal is to get clang to optimize some code containing a call to
>>>> std::min_element which is dereferenced, so something like:
>>>>
>>>> float min_element_example(float *data, int size)
>>>> {
>>>> return *std::min_element(data, data+size);
>>>> }
>>>>
>>>> which, after inlining a specialization, looks like:
>>>>
>>>> float min_element_example_inlined(float *first, float *last)
>>>> {
>>>> for (float *p = first; p != last; ++p)
>>>> {
>>>> if (*p < *first)
>>>> first = p;
>>>> }
>>>> return *first;
>>>> }
>>>>
>>>> There are two loads in the loop, *p and *first, but actually
>>>> the load *p can be
>>>> eliminated by using either the previous load *p or the previous
>>>> *first,
>>>> depending on if the if-condition was taken or not. However the
>>>> "if (*p < *first) first = p" gets turned by simplifycfg into a
>>>> select and this
>>>> makes optimizing this a lot harder because you no longer have
>>>> distinct paths
>>>> through the CFG.
>>>>
>>>> I have some ideas on how to do the optimization (see my
>>>> previous RFC "Making GVN
>>>> able to visit the same block more than once" posted in April,
>>>> though I've
>>>> decided that the specific idea presented there isn't the right
>>>> way to do it),
>>>> but I think the first step is to make sure we don't have a
>>>> select when we try
>>>> to optimise this.
>>>>
>>>> Approach
>>>> ========
>>>>
>>>> I've posted a patch to https://reviews.llvm.org/D50723
>>>> <https://reviews.llvm.org/D50723> showing what I'm
>>>> intending to do. An extra parameter is added to SimplifyCFG to
>>>> control whether
>>>> two-entry phi nodes are converted into select, and this is set
>>>> to false in all
>>>> instances before the end of module simplification. At the end
>>>> of module
>>>> simplification we do SimplifyCFG, then Instcombine to optimise
>>>> the selects that
>>>> are introduced, then EarlyCSE to eliminate common
>>>> subexpressions introduced by
>>>> instcombine.
>>>>
> Is scheduling another simplifycfg, instcombine + earlycse pass really
> necessary? I’m concerned about the compile time impact.
>
> Cheers,
> Amara
>>>>
>>>>
>>>> Benchmark Results
>>>> =================
>>>>
>>>> These are performance differences reported by LNT when running
>>>> llvm-test-suite,
>>>> spec2000, and spec2006 at -O3 with and without the patch linked
>>>> above (using
>>>> trunk llvm from a week or so ago).
>>>>
>>>> AArch64 results on ARM Cortex-A72:
>>>>
>>>> Performance Regressions - execution_time
>>>> Change
>>>> SingleSource/Benchmarks/Shootout/Shootout-ary3
>>>> 9.48%
>>>> MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt
>>>> 3.79%
>>>> SingleSource/Benchmarks/CoyoteBench/huffbench
>>>> 1.40%
>>>>
>>>> Performance Improvements - execution_time
>>>> Change
>>>> MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl
>>>> -23.74%
>>>> External/SPEC/CINT2000/256.bzip2/256.bzip2
>>>> -9.82%
>>>> MultiSource/Benchmarks/TSVC/Searching-flt/Searching-flt
>>>> -9.57%
>>>> MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt
>>>> -4.38%
>>>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt
>>>> -3.94%
>>>> MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl
>>>> -3.44%
>>>> External/SPEC/CFP2006/453.povray/453.povray
>>>> -2.50%
>>>> SingleSource/Benchmarks/Adobe-C++/stepanov_vector
>>>> -1.49%
>>>>
>>>> X86_64 results on Intel Xeon E5-2690:
>>>>
>>>> Performance Regressions - execution_time Change
>>>> MultiSource/Benchmarks/Ptrdist/yacr2/yacr2 5.62%
>>>>
>>>> Performance Improvements - execution_time Change
>>>> SingleSource/Benchmarks/Misc-C++/Large/sphereflake -4.43%
>>>> External/SPEC/CINT2006/456.hmmer/456.hmmer -2.50%
>>>> External/SPEC/CINT2006/464.h264ref/464.h264ref -1.60%
>>>> MultiSource/Benchmarks/nbench/nbench -1.19%
>>>> SingleSource/Benchmarks/Adobe-C++/functionobjects -1.07%
>>>>
>>>> I had a brief look at the regressions and they all look to be
>>>> caused by
>>>> getting bad luck with branch mispredictions: I looked into the
>>>> Shootout-ary3 and
>>>> yacr2 cases and in both the hot code path was the same with and
>>>> without the
>>>> patch, but with more mispredictions probably caused by changes
>>>> elsewhere.
>>>>
>>>> John
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180817/6456152a/attachment.html>
More information about the llvm-dev
mailing list