[llvm-dev] RFC: Moving DAG heuristic-based transforms to MI passes

Fri Jan 27 12:56:54 PST 2017

On 01/27/2017 10:30 AM, Andrew V. Tischenko via llvm-dev wrote:

> All llvm-devs,
>
> We would like to propose a new implementation for optimizations such
> as using a reciprocal estimate instead of fdiv. In short, the fdiv
> instruction (which is very expensive on most CPUs) is replaced with an
> alternative sequence of instructions that is usually cheaper but still
> has adequate precision (see genReciprocalDiv in
> lib/Target/X86/X86InstrInfo.cpp for details). There are other similar
> optimizations, such as the use of rsqrt, but at the moment we're
> dealing with the reciprocal estimate only; see
> https://reviews.llvm.org/D26855 for details.
>
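For readers unfamiliar with the transform, the replacement sequence can be sketched in plain C++ as a coarse reciprocal estimate followed by one Newton-Raphson refinement step. This is only an illustration under stated assumptions: coarseReciprocal is a hypothetical stand-in for a hardware estimate instruction (it emulates a low-precision result by truncating mantissa bits), and none of these function names come from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal-estimate instruction.
// It computes 1/x and then clears the low 12 mantissa bits, so only a
// coarse approximation remains (relative error up to about 2^-11).
float coarseReciprocal(float x) {
  assert(x != 0.0f && "reciprocal of zero is not handled in this sketch");
  float r = 1.0f / x;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof(bits));
  bits &= 0xFFFFF000u; // keep sign, exponent, and the top 11 mantissa bits
  std::memcpy(&r, &bits, sizeof(r));
  return r;
}

// One Newton-Raphson step for f(r) = 1/r - x, i.e. r' = r * (2 - x * r).
// Each step roughly squares the relative error of the estimate.
float refineReciprocal(float x, float r) { return r * (2.0f - x * r); }

// a / b approximated as a * (refined estimate of 1 / b).
float approxDiv(float a, float b, int steps = 1) {
  float r = coarseReciprocal(b);
  for (int i = 0; i < steps; ++i)
    r = refineReciprocal(b, r);
  return a * r;
}
```

With one refinement step, the result is close to single precision for ordinary inputs, but it is not guaranteed to be correctly rounded, which is one reason the transform is restricted to fast-math compilations.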
> The current version of the optimization is done at the DAG Combiner
> level, where we don't yet know the exact target instructions that
> CodeGen will use. As a result, we don't know the real cost of the
> alternative sequence and can't compare it with the cost of a single
> fdiv. The decision to select the alternative sequence (made from
> compiler options only) can therefore be wrong, because on modern CPUs
> with a very cheap fdiv we should use fdiv directly.
>
> We suggest moving the implementation from DAG heuristics to
> MI-scheduler-based transformations (Machine Combiner). At that point we
> know the exact target instructions and can use the scheduler-based
> cost model. This knowledge allows us to select the proper code
> sequence for final target code generation.
>
> A possible disadvantage of the new implementation is increased compile
> time (as discussed in D26855), but we expect to make improvements in
> that area. For the initial change (the reciprocal transform), any
> difference is limited to fast-math compilations.
>
> Any objections, suggestions, or comments?
>

Are you asking whether it is okay to commit the change first and then
look at the MachineCombiner's worst-case performance in a follow-up? In
general, I think that moving to using the MachineCombiner for these
kinds of transformations, where there are complex tradeoffs between
latency, throughput, etc., is the right direction.
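
To make the kind of decision being discussed concrete, here is a minimal sketch of a latency-only comparison between a single fdiv and the estimate-plus-refinement chain. Everything in it is an assumption for illustration: the TargetCosts struct, the function names, and any cycle counts passed in are hypothetical, and this is not the MachineCombiner's interface. A real cost model would also weigh throughput and the surrounding critical path.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical per-target latencies in cycles. Values used with this
// struct are illustrative only, not measurements of any real CPU.
struct TargetCosts {
  unsigned FDivLatency;     // single fdiv
  unsigned EstimateLatency; // reciprocal estimate instruction
  unsigned MulLatency;      // floating-point multiply
  unsigned SubLatency;      // floating-point subtract
};

// Latency of the dependent chain: estimate, one refinement step
// (mul, sub, mul), then the final multiply by the numerator.
unsigned reciprocalSequenceLatency(const TargetCosts &C) {
  std::vector<unsigned> Chain = {C.EstimateLatency, C.MulLatency,
                                 C.SubLatency, C.MulLatency, C.MulLatency};
  return std::accumulate(Chain.begin(), Chain.end(), 0u);
}

// Replace fdiv only when the whole chain is strictly cheaper.
bool shouldReplaceFDiv(const TargetCosts &C) {
  assert(C.FDivLatency > 0 && "fdiv latency must be known");
  return reciprocalSequenceLatency(C) < C.FDivLatency;
}
```

With made-up numbers, a target described as {40, 5, 4, 3} would take the sequence (20 cycles against 40), while one described as {10, 5, 4, 3} would keep the fdiv. An options-only decision at the DAG level cannot make this distinction.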

  -Hal


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory