[llvm-dev] RFC: Moving DAG heuristic-based transforms to MI passes

Sat Jan 28 02:19:14 PST 2017

In fact, committing the change before dealing with the worst-case 
performance is a good idea, because these are two separate issues. But 
the main point of this RFC is to show a better approach to these kinds 
of transformations and to suggest using this approach in the future.

At the same time, I'd like to make clear that this patch is not a 
performance patch: the generated code is almost identical to what we 
have now. It is a proposal to change the strategy for developing such 
transformations. If the community accepts this new strategy, we're 
ready to introduce further similar transformations, automate the 
framework, etc. Of course, those will be topics for new RFCs and 
discussions.
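
To make the strategy concrete, here is a minimal sketch of the kind of 
cost-based decision meant here. It assumes only per-instruction 
latencies are known; the helper names chainLatency and 
preferEstimateSequence are hypothetical and are not LLVM APIs. A real 
scheduler-based model would also weigh throughput and other tradeoffs.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper, not an LLVM API: the latency of a chain of
// dependent instructions is the sum of the individual latencies.
static unsigned chainLatency(const std::vector<unsigned> &Latencies) {
  return std::accumulate(Latencies.begin(), Latencies.end(), 0u);
}

// Prefer the estimate sequence only if its dependent chain is strictly
// shorter than a single fdiv on the target being compiled for.
static bool preferEstimateSequence(unsigned FDivLatency,
                                   const std::vector<unsigned> &SeqLatencies) {
  return chainLatency(SeqLatencies) < FDivLatency;
}
```

With made-up latencies, preferEstimateSequence(20, {4, 4, 4, 4}) picks 
the sequence, while preferEstimateSequence(11, {4, 4, 4, 4}) keeps the 
fdiv. The numbers are illustrative only, not measurements of any CPU.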

On 1/27/2017 11:56 PM, Hal Finkel wrote:
> On 01/27/2017 10:30 AM, Andrew V. Tischenko via llvm-dev wrote:
>
>> All llvm-devs,
>>
>> We're going to introduce a new possible implementation for 
>> optimizations such as using a reciprocal estimate instead of fdiv. In 
>> short, it replaces the fdiv instruction (which is very expensive on 
>> most CPUs) with an alternative sequence of instructions which is 
>> usually cheaper but has appropriate precision (see genReciprocalDiv in 
>> lib/Target/X86/X86InstrInfo.cpp for details). There are other similar 
>> optimizations, such as the use of rsqrt, but at the moment we're 
>> dealing with the reciprocal estimate only - see 
>> https://reviews.llvm.org/D26855 for details.
>>
>> The current version of the optimization is done at the DAG Combiner 
>> level, when we don't yet know the exact target instructions which will 
>> be used by CodeGen. As a result, we don't know the real cost of the 
>> alternative sequence and can't compare that cost with the cost of the 
>> single fdiv. Consequently, the decision to select an alternative 
>> sequence (made on compiler options only) could be wrong, because 
>> modern CPUs provide a very cheap fdiv, which we should use directly.
>>
>> We suggest moving the implementation from DAG heuristics to 
>> MI-scheduler-based transformations (Machine Combiner). At that point 
>> we know the exact target instructions and are able to use the 
>> scheduler-based cost model. This knowledge allows us to select the 
>> proper code sequence for final target code generation.
>>
>> A possible disadvantage of the new implementation is an increase in 
>> compile time (as discussed in D26855), but we expect to make 
>> improvements in that area. For the initial change (reciprocal 
>> transform), any difference is limited to fast-math compilations.
>>
>> Any objections, suggestions, or comments?
>>
>
> Are you asking whether it is okay to commit the change first and then 
> look at the MachineCombiner's worst-case performance in a follow-up? In 
> general, I think that moving to using the MachineCombiner for these 
> kinds of transformations, where there are complex tradeoffs between 
> latency, throughput, etc., is the right direction.
>
>  -Hal
>
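
For readers unfamiliar with the transform discussed in the RFC, the 
following is a minimal sketch of replacing a / d with a reciprocal 
estimate plus one Newton-Raphson refinement step. It is not the code 
from D26855. crudeRecipEstimate is a hypothetical stand-in that mimics 
a low-precision hardware estimate by truncating mantissa bits, and the 
sketch assumes normal, finite, non-zero inputs.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal-estimate instruction:
// returns 1/d with only about 12 significant mantissa bits kept.
static float crudeRecipEstimate(float d) {
  float r = 1.0f / d;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof bits);
  bits &= 0xFFFFF800u; // clear the low 11 of the 23 stored mantissa bits
  std::memcpy(&r, &bits, sizeof r);
  return r;
}

// One Newton-Raphson step for 1/d: x1 = x0 * (2 - d * x0).
// Each step roughly doubles the number of correct bits.
static float refineRecip(float d, float x0) {
  return x0 * (2.0f - d * x0);
}

// a / d computed as a * (refined estimate of 1/d).
static float approxDiv(float a, float d) {
  return a * refineRecip(d, crudeRecipEstimate(d));
}
```

The result is close to, but not necessarily bit-identical with, a true 
division, which is why such a transform is restricted to fast-math 
compilations.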