[PATCH] D124867: [SLP][NFC] Pre-commit test showing horizontal reduction preventing FMA

Wed May 11 08:11:04 PDT 2022

wjschmidt added a comment.

In D124867#3506304 <https://reviews.llvm.org/D124867#3506304>, @ABataev wrote:

> In D124867#3506289 <https://reviews.llvm.org/D124867#3506289>, @wjschmidt wrote:
>
>> In D124867#3506218 <https://reviews.llvm.org/D124867#3506218>, @ABataev wrote:
>>
>>>>> Also, why are these sequences not optimized by InstructionCombiner to FMA?
>>>>
>>>> Phase ordering -- it seems the FMA combining happens quite late in the pipeline.  When we replace the adds with a horizontal reduction, the opportunity is removed.
>>>
>>> Why, could you investigate it?
>>
>> I'll have to refresh my memory, but my recollection is that the FMA combining is done in the MI level instruction combiner.
>
> Why? Are there any target-caused limitations?

I can't speak to the choices made by the InstCombine designers, and I don't see any comments in the code explaining them.  I do see that InstCombineMulDivRem.cpp goes out of its way to create opportunities for later FMA combining by generating an FMul followed by an FAdd or FSub, so not creating an FMA at the IR level appears to be a deliberate choice.  InstCombineCalls.cpp also has some small optimizations on existing Intrinsic::fma calls, but nothing that creates one.
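
To make the phase-ordering concern concrete, here is a rough source-level sketch of the two shapes involved.  This is hand-written illustration only, not code from the patch or from LLVM; the function names and the four-element width are assumptions picked for the example.  In the first shape every add has a multiply as a direct operand, which is the pattern a late FMA combine looks for.  In the second shape, which approximates what a multiply followed by a horizontal add reduction computes, the adds only see already-computed products.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical example: scalar chain where each add consumes a multiply
// directly.  std::fma stands in for the fused instruction a late combine
// could form from each fmul/fadd pair.
double dot4_fma_chain(const double *a, const double *b) {
    double acc = a[0] * b[0];
    acc = std::fma(a[1], b[1], acc);
    acc = std::fma(a[2], b[2], acc);
    acc = std::fma(a[3], b[3], acc);
    return acc;
}

// Hypothetical example: all products are formed first (as a vector multiply
// would), then summed pairwise (as a horizontal reduction might).  The adds
// operate on stored products, so there is no mul+add pair left to fuse.
double dot4_mul_then_reduce(const double *a, const double *b) {
    double prod[4];
    for (std::size_t i = 0; i < 4; ++i)
        prod[i] = a[i] * b[i];
    return (prod[0] + prod[1]) + (prod[2] + prod[3]);
}
```

Both functions compute the same dot product (up to rounding differences for inputs that are not exactly representable); the point is only where the multiplies sit relative to the adds.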

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124867/new/

https://reviews.llvm.org/D124867