[llvm-dev] [RFC] Delaying phi-to-select transformation until later in the pass pipeline
    Philip Reames via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Fri Aug 17 13:17:15 PDT 2018
    
    
  
On 08/17/2018 09:02 AM, Amara Emerson wrote:
>
>> On Aug 15, 2018, at 10:57 PM, Hal Finkel via llvm-dev 
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>>
>> On 08/15/2018 02:38 PM, Philip Reames via llvm-dev wrote:
>>>
>>> I'm concerned that we're focusing on one side of this.  Let me point 
>>> out a few concerns w/changing the canonical form here:
>>>
>>>  1. LICM does not know how to hoist or sink regions.  It does know
>>>     how to hoist and sink selects.
>>>
> I think we should teach LICM to do this eventually.
Agreed.  I just don't currently see patches to do so.  Once LICM 
supports region hoisting, much of my concern disappears.
>>>
>>>  2. InstCombine has limited support for triangles/diamonds, but
>>>     fairly extensive support for selects.
>>>  3. EarlyCSE and GVN do not know how to eliminate fully redundant
>>>     triangles/diamonds.  PRE *may* get some of these cases, but that
>>>     is by no means guaranteed or likely to be robust.
>>>
> Agreed, we’ll need a plan to deal with these issues.
Again, I'd love to see these issues fixed.  Once that's done, the 
convert about phi vs select as canonical form is much less relevant.
>>>
>>> My personal opinion is that selects are the appropriate canonical form.
>>>
>>> For the one of the specific cases mentioned, teaching GVN to do FRE 
>>> and PRE for loads from selects of pointers just doesn't seem that 
>>> painful.  I would be really tempted to do that instead.  Similarly, 
>>> walking facts back from select uses in CVP seems generally useful 
>>> since we have use dependent facts in a bunch of contexts, not just 
>>> selects.  (Call arguments for instance,  non-null from unconditional 
>>> deref, etc..)
>>>
>>> To be clear, I am raising concerns, not actively objecting to this.  
>>> If you want to move forward and commit to work through all of the 
>>> issues identified I wont actively stand in the way.
>>>
>>
>> As I've expressed in the past, I think that not using select as part 
>> of our canonical form is potentially a superior design. However, it 
>> would be a major change. In addition to the issues that Philip 
>> mentions, there's also the fact that we'll just have more basic 
>> blocks and that, in itself, could lead to an increase in compile 
>> time. However, working through these issues will likely leave us with 
>> a more-robust optimizer.
>>
>> See also: https://bugs.llvm.org/show_bug.cgi?id=34603#c19
> Canonicalizing to phis is my preference too.
>>
>>  -Hal
>>
>>> Philip
>>>
>>>
>>> On 08/14/2018 12:39 PM, Sanjay Patel via llvm-dev wrote:
>>>> I didn't look closely at the new patch, but this appears to be a 
>>>> small extension to:
>>>> https://reviews.llvm.org/D38566
>>>> ...and the GVN-based reasons for delaying transformation to 
>>>> 'select' are discussed in detail in the motivating bug for that patch:
>>>> https://bugs.llvm.org/show_bug.cgi?id=34603
>>>>
>>>> So this sounds like the right direction to me. Note that there was 
>>>> objection to the implementation (a pile of pass options vs. 
>>>> uniquely named passes).
>>>>
>>>> Here's another motivating bug where early transform to select 
>>>> prevents optimization:
>>>> https://bugs.llvm.org/show_bug.cgi?id=36760
>>>>
>>>> Is that case affected by this patch?
>>>>
>>>>
>>>> On Tue, Aug 14, 2018 at 11:17 AM, John Brawn via llvm-dev 
>>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>
>>>>     Summary
>>>>     =======
>>>>
>>>>     I'm planning on adjusting SimplifyCFG so that it doesn't turn
>>>>     two-entry phi
>>>>     nodes into selects until later in the pass pipeline, to give
>>>>     passes which can
>>>>     understand phis but not selects more opportunity to optimize.
>>>>     The thing I'm
>>>>     trying to do which made me think of doing this is described
>>>>     below, but from the
>>>>     benchmarking I've done it looks like this is overall a good
>>>>     idea regardless of
>>>>     if I manage to get that done or not.
>>>>
>>>>     Motivation
>>>>     ==========
>>>>
>>>>     My goal is to get clang to optimize some code containing a call to
>>>>     std::min_element which is dereferenced, so something like:
>>>>
>>>>       float min_element_example(float *data, int size)
>>>>       {
>>>>         return *std::min_element(data, data+size);
>>>>       }
>>>>
>>>>     which, after inlining a specialization, looks like:
>>>>
>>>>       float min_element_example_inlined(float *first, float *last)
>>>>       {
>>>>         for (float *p = first; p != last; ++p)
>>>>         {
>>>>           if (*p < *first)
>>>>             first = p;
>>>>         }
>>>>         return *first;
>>>>       }
>>>>
>>>>     There are two loads in the loop, *p and *first, but actually
>>>>     the load *p can be
>>>>     eliminated by using either the previous load *p or the previous
>>>>     *first,
>>>>     depending on if the if-condition was taken or not. However the
>>>>     "if (*p < *first) first = p" gets turned by simplifycfg into a
>>>>     select and this
>>>>     makes optimizing this a lot harder because you no longer have
>>>>     distinct paths
>>>>     through the CFG.
>>>>
>>>>     I have some ideas on how to do the optimization (see my
>>>>     previous RFC "Making GVN
>>>>     able to visit the same block more than once" posted in April,
>>>>     though I've
>>>>     decided that the specific idea presented there isn't the right
>>>>     way to do it),
>>>>     but I think the first step is to make sure we don't have a
>>>>     select when we try
>>>>     to optimise this.
>>>>
>>>>     Approach
>>>>     ========
>>>>
>>>>     I've posted a patch to https://reviews.llvm.org/D50723
>>>>     <https://reviews.llvm.org/D50723> showing what I'm
>>>>     intending to do. An extra parameter is added to SimplifyCFG to
>>>>     control whether
>>>>     two-entry phi nodes are converted into select, and this is set
>>>>     to false in all
>>>>     instances before the end of module simplification. At the end
>>>>     of module
>>>>     simplification we do SimplifyCFG, then Instcombine to optimise
>>>>     the selects that
>>>>     are introduced, then EarlyCSE to eliminate common
>>>>     subexpressions introduced by
>>>>     instcombine.
>>>>
> Is scheduling another simplifycfg, instcombine + earlycse pass really 
> necessary? I’m concerned about the compile time impact.
>
> Cheers,
> Amara
>>>>
>>>>
>>>>     Benchmark Results
>>>>     =================
>>>>
>>>>     These are performance differences reported by LNT when running
>>>>     llvm-test-suite,
>>>>     spec2000, and spec2006 at -O3 with and without the patch linked
>>>>     above (using
>>>>     trunk llvm from a week or so ago).
>>>>
>>>>     AArch64 results on ARM Cortex-A72:
>>>>
>>>>     Performance Regressions - execution_time                      
>>>>     Change
>>>>     SingleSource/Benchmarks/Shootout/Shootout-ary3                
>>>>            9.48%
>>>>     MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt            
>>>>           3.79%
>>>>     SingleSource/Benchmarks/CoyoteBench/huffbench                  
>>>>           1.40%
>>>>
>>>>     Performance Improvements - execution_time                    
>>>>      Change
>>>>     MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl        
>>>>         -23.74%
>>>>     External/SPEC/CINT2000/256.bzip2/256.bzip2                    
>>>>           -9.82%
>>>>     MultiSource/Benchmarks/TSVC/Searching-flt/Searching-flt        
>>>>          -9.57%
>>>>     MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 
>>>>          -4.38%
>>>>     MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt
>>>>     -3.94%
>>>>     MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl            
>>>>          -3.44%
>>>>     External/SPEC/CFP2006/453.povray/453.povray                    
>>>>          -2.50%
>>>>     SingleSource/Benchmarks/Adobe-C++/stepanov_vector              
>>>>          -1.49%
>>>>
>>>>     X86_64 results on Intel Xeon E5-2690:
>>>>
>>>>     Performance Regressions - execution_time    Change
>>>>     MultiSource/Benchmarks/Ptrdist/yacr2/yacr2         5.62%
>>>>
>>>>     Performance Improvements - execution_time   Change
>>>>     SingleSource/Benchmarks/Misc-C++/Large/sphereflake -4.43%
>>>>     External/SPEC/CINT2006/456.hmmer/456.hmmer        -2.50%
>>>>     External/SPEC/CINT2006/464.h264ref/464.h264ref    -1.60%
>>>>     MultiSource/Benchmarks/nbench/nbench              -1.19%
>>>>     SingleSource/Benchmarks/Adobe-C++/functionobjects -1.07%
>>>>
>>>>     I had a brief look at the regressions and they all look to be
>>>>     caused by
>>>>     getting bad luck with branch mispredictions: I looked into the
>>>>     Shootout-ary3 and
>>>>     yacr2 cases and in both the hot code path was the same with and
>>>>     without the
>>>>     patch, but with more mispredictions probably caused by changes
>>>>     elsewhere.
>>>>
>>>>     John
>>>>
>>>>     _______________________________________________
>>>>     LLVM Developers mailing list
>>>>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> -- 
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180817/6456152a/attachment.html>
    
    
More information about the llvm-dev
mailing list