[PATCH] Refactor and enhance FMA combine
Mehdi Amini
mehdi.amini at apple.com
Thu Apr 2 13:04:22 PDT 2015
> On Apr 2, 2015, at 12:59 PM, Stephen Canon <scanon at apple.com> wrote:
>
>
>> On Apr 2, 2015, at 3:48 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>
>> Hi Olivier,
>>
>> Tentatively CC: Steve who might have an input on the validity of the two transforms below without fast-math.
>>
>>> On Apr 2, 2015, at 12:41 PM, Olivier Sallenave <ohsallen at us.ibm.com> wrote:
>>>
>>> Hi Mehdi,
>>>
>>>> I’d rather see the duplicated code (the one made obsolete by a correct canonicalization) removed from your patch (i.e. do not build technical debt), and a separate commits that implement the canonicalization part.
>>>
>>>
>>> Makes perfect sense, thanks for your feedback. I was able to do the canonicalization you suggest by adding the following transforms:
>>>
>>> (fsub (fneg A), B) -> (fneg (fadd A, B))
>>> (fpext (fneg x)) -> (fneg (fpext x))
>>>
>>> The problem is that I think those should be enabled only with -enable-unsafe-fp-math. So the two FMA combines that can be removed because of the canonicalization now only happen with -enable-unsafe-fp-math, whereas they used to work with -fp-contract=fast as well... Not sure what to do here.
>>
>> I overlook the fact that we have these pesky fine grain flags :(
>>
>> Now for these two particular transformations, they sound OK to me even without fast-math, but I rather have a numeric expert confirming because of all the possible edge cases.
>
>
> If A == –B, then (fsub (fneg A), B) is (fsub B, B) = +0 in default rounding. But (fneg (fadd A, B)) is (fneg +0) == –0, so should require fast-math / no-signed-zero / whatever. The second one is fine.
Thanks Steve!
I probably wouldn’t have catched the +0/-0 rounding.
Olivier, I suggest you keep you patch like it is, maybe with a comment mentioning that the two orthogonal flags (fp-contract and unsafe-fp-math) prevent you from relying on the canonicalization?
The canonicalization in fast-math is maybe still valuable by itself though and could be committed separately (Maybe Hal/Owen have an input on this)?
Thanks,
Mehdi
More information about the llvm-commits
mailing list