[PATCH] D26602: [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default

Thu Dec 1 07:30:13 PST 2016

nhaehnle updated this revision to Diff 79910.
nhaehnle added a comment.

Rearrange the logic. It looks quite readable to me this way, and
clang-format-diff agrees with the formatting.

Thinking about the FMA case again, isn't it actually obvious? At least today
I'm quite convinced by the following argument:

> The mathematically exact result of `x * (y + 1)` equals that of `x * y + x`.
> FMA returns the correctly rounded value of that exact result. So whatever
> rounding happens in (fmul x (fadd y 1.0)), the FMA variant can never be
> less accurate.

Not sure why I didn't think of that before...
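
To make the argument concrete, here is a small standalone sketch, not part of the patch. It assumes IEEE double with the default round-to-nearest-even mode; the function names and the inputs x = 3, y = 2^-53 are hand-picked for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Unfused form, i.e. (fmul x (fadd y 1.0)). The volatile temporary forces
// y + 1.0 to be rounded to double before the multiply.
double mul_of_add(double x, double y) {
  volatile double sum = y + 1.0;
  return x * sum;
}

// Fused form: x * y + x computed with a single rounding at the end.
double fused(double x, double y) { return std::fma(x, y, x); }

// Illustrative check with x = 3 and y = 2^-53.
void example_check() {
  const double x = 3.0;
  const double y = std::ldexp(1.0, -53);
  // 1 + 2^-53 is a tie between 1 and 1 + 2^-52 and rounds to even (1.0),
  // so the unfused product is exactly 3.0.
  assert(mul_of_add(x, y) == 3.0);
  // The exact result 3 + 3 * 2^-53 lies closer to the next double above 3.0,
  // which is what the single rounding of the FMA returns.
  assert(fused(x, y) == std::nextafter(3.0, 4.0));
}
```

For these inputs the unfused form is off by 3 * 2^-53 while the fused form is off by only 2^-53, which matches the claim that the FMA result is the best rounding of the exact value.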

Tests are all passing with the changes from this patch, except one
unfortunate code quality regression in AMDGPU that I think should be
discussed separately.

https://reviews.llvm.org/D26602

Files:
  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  test/CodeGen/AMDGPU/fma-combine.ll
  test/CodeGen/X86/fma_patterns.ll
  test/CodeGen/X86/fma_patterns_wide.ll
