[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

Tue Sep 23 06:58:14 PDT 2014

Hi Oleg,

On 22/09/14 17:56, Oleg Ranevskyy wrote:
> Hi Duncan,
>
> On 17.09.2014 21:10, Duncan Sands wrote:
>> Hi Oleg,
>>
>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>> Hi,
>>>
>>> Thank you for all your helpful comments.
>>>
>>> To sum up, below is the list of correct folding examples for fadd:
>>>         (1)  fadd %x, -0.0            ->     %x
>>>         (2)  fadd undef, undef    ->     undef
>>>         (3)  fadd %x, undef         ->     NaN  (undef may be chosen to be a
>>> NaN, which is then propagated)
>>>
>>> Looking through the code I found the "NoNaNs" flag accessed through an instance
>>> of the FastMathFlags class.
>>> (2) and (3) should probably depend on it.
>>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are
>>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>>
>> I think it's exactly the other way round: if NoNaNs is set then you can fold
>> (2) and (3) to undef.  That's because (IIRC) the NoNaNs flag promises that no
>> NaNs will be used by the program. However "undef" could be a NaN, thus the
>> promise is broken, meaning the program is performing undefined behaviour, and
>> you can do whatever you want.
> Oh, I see the point now. I thought that if NoNaNs was set, no NaNs were possible
> at all. But undef is still an arbitrary bit pattern that might happen to be the
> bit pattern of a NaN. Thank you for the explanation.
>
> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, whereas
> "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is set) or a
> NaN (NoNaNs is not set).
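
Restated as a rough sketch, just to pin the rule down. The function name and
the string labels are made up for illustration; this is not InstructionSimplify
code. Treating an undef left operand the same way as an undef right operand is
an assumption on my part, based on NaN propagation not caring which operand is
the NaN.

```python
from typing import Optional


def fold_fp_binop(lhs_is_undef: bool, rhs_is_undef: bool,
                  no_nans: bool) -> Optional[str]:
    """Hypothetical decision table for fadd/fsub/fmul/fdiv with undef operands.

    Returns "undef", "NaN", or None when none of the rules applies.
    """
    if lhs_is_undef and rhs_is_undef:
        # "op undef, undef" can always be folded to undef.
        return "undef"
    if lhs_is_undef or rhs_is_undef:
        # Exactly one operand is undef. undef may be chosen to be a NaN:
        # with NoNaNs set that breaks the flag's promise, so anything goes
        # (undef); otherwise the NaN is propagated, so fold to a NaN.
        return "undef" if no_nans else "NaN"
    # No undef operand: these rules say nothing.
    return None


if __name__ == "__main__":
    for lhs, rhs, nnan in [(True, True, False), (False, True, True),
                           (False, True, False), (False, False, False)]:
        print(lhs, rhs, nnan, "->", fold_fp_binop(lhs, rhs, nnan))
```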

for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always being equal 
to %x (likewise: fdiv %x, 1.0 being equal to %x).  Is this true?
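
One way to poke at this empirically is to compare bit patterns before and after
multiplying by 1.0 on the host. This is only a sketch: the helper names are
hypothetical, Python floats stand in for IEEE doubles, and whatever it prints
describes the machine it runs on, not what LLVM may assume for every target.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b: int) -> float:
    """Build a double from a 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def mul_by_one_bits(x: float):
    """Return (bits of x, bits of x * 1.0) so the two can be compared."""
    return bits(x), bits(x * 1.0)


# Ordinary values, signed zeros, the smallest subnormal, infinities,
# the default quiet NaN, and a quiet NaN with a non-zero payload.
samples = [0.0, -0.0, 1.5, -2.25, 5e-324, math.inf, -math.inf,
           float("nan"), from_bits(0x7FF8000000001234)]

for x in samples:
    before, after = mul_by_one_bits(x)
    verdict = "same" if before == after else "DIFFERENT"
    print(f"{before:016x} -> {after:016x}  {verdict}")
```

The NaN rows are the interesting ones, since that is where a target is allowed
some freedom; the non-NaN rows should come out unchanged on any IEEE host.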

Ciao, Duncan.

>
> Oleg
>>
>>>
>>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the
>>> same rules seem applicable to them as well:
>>> ---------------------------------------------------------------------
>>> - fdiv:
>>>         (4) "fdiv %x, undef" is now folded to undef.
>>
>> But should be folded to NaN, not undef.
>>
>>>         The code comment states this is done because undef might be a sNaN. We
>>> can't rely on sNaNs as they can either be masked or the platform might not have
>>> FP exceptions at all. Nevertheless, such folding is still correct due to the NaN
>>> propagation rules we found in the Standard - undef might be chosen to be a NaN
>>> and its payload will be propagated.
>>>         Moreover, this looks similar to (3) and can be folded to a NaN. /Is it
>>> worth doing?/
>>
>> As the current folding to undef is wrong, it has to be fixed.
>>
>>>
>>>         (5) fdiv undef, undef    ->    undef
>>
>> Yup.
>>
>>> ---------------------------------------------------------------------
>>> - fmul:
>>>         (6) fmul undef, undef    ->    undef
>>
>> Yup.
>>
>>>         (7) fmul %x, undef        -> NaN or undef (undef may be chosen to be a
>>> NaN, which is then propagated)
>>
>> Should be folded to NaN, not undef.
>>
>>> ---------------------------------------------------------------------
>>> - fsub:
>>>         (8) fsub %x, -0.0           ->    %x  (if %x is not -0.0; works this way
>>> now)
>>
>> Should this be: fsub %x, +0.0 ?
> fsub %x, +0.0 is also covered and always folded to %x.
> The version with -0.0 is similar except it additionally checks if %x is not -0.0.
>>
>>>         (9) fsub %x, undef        -> NaN or undef (undef may be chosen to be a
>>> NaN, which is then propagated)
>>
>> Should fold to NaN not undef.
>>
>>>       (10) fsub undef, undef   -> undef
>>
>> Yup.
>>
>> Ciao, Duncan.
>>
>>> ---------------------------------------------------------------------
>>>
>>> I will be very thankful if you could review this final summary and share your
>>> thoughts.
>>>
>>> Thank you.
>>>
>>> P.S. Sorry for bothering you again and again.
>>> Just want to make sure I clearly understand the subject in order to make correct
>>> code changes and to be able to help others with this in the future.
>>>
>>> Kind regards,
>>> Oleg
>>>
>>> On 16.09.2014 21:42, Duncan Sands wrote:
>>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>>> As far as I know, LLVM does not try very hard to guarantee constant folded
>>>>> NaN payloads that match exactly what the target would generate.
>>>>
>>>> I'm with Owen here.  Unless ARM people object, I think it is reasonable to say
>>>> that at the LLVM IR level we may assume that the IEEE rules are followed.
>>>>
>>>> Ciao, Duncan.
>>>>
>>>>>
>>>>> —Owen
>>>>>
>>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Duncan,
>>>>>>
>>>>>> I reread everything we've discussed so far and would like to pay closer
>>>>>> attention to ARM's FPSCR register mentioned by Stephen.
>>>>>> It's really possible on ARM systems that floating point operations on one or
>>>>>> more qNaN operands return a NaN different from the operands. I.e. operand
>>>>>> NaN is not propagated. This happens when the "default NaN" flag is set in
>>>>>> the FPSCR (floating point status and control register). The result in this
>>>>>> case is some default NaN value.
>>>>>>
>>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>>> InstructionSimplify, might produce a different result if %x is a NaN. This
>>>>>> breaks the NaN propagation rules the IEEE standard establishes and
>>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>>
>>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef". We can't rely
>>>>>> on getting an arbitrary NaN here on ARMs.
>>>>>>
>>>>>> Would you be able to confirm this please?
>>>>>>
>>>>>> Thank you in advance for your time!
>>>>>>
>>>>>> Kind regards,
>>>>>> Oleg
>>>>>>
>>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>>> Hi Oleg,
>>>>>>>
>>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>>> Hi Duncan,
>>>>>>>>
>>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>>
>>>>>>>> *6.2 Operations with NaNs*
>>>>>>>> /"For an operation with quiet NaN inputs, other than maximum and minimum
>>>>>>>> operations, if a floating-point result is to be delivered the result shall
>>>>>>>> be a quiet NaN which should be one of the input NaNs"/.
>>>>>>>>
>>>>>>>> *6.2.3 NaN propagation*
>>>>>>>> /"An operation that propagates a NaN operand to its result and has a
>>>>>>>> single NaN as an input should produce a NaN with the payload of the input
>>>>>>>> NaN if representable in the destination format"./
>>>>>>>
>>>>>>> thanks for finding this out.
>>>>>>>
>>>>>>>>
>>>>>>>> Floating point add propagates a NaN. There is no conversion in the
>>>>>>>> context of LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result
>>>>>>>> is also a NaN with the same payload.
>>>>>>>
>>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct.  This implies that "fadd
>>>>>>> undef, undef" can be folded to "undef".
>>>>>>>
>>>>>>>>
>>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and undef might be
>>>>>>>> chosen to be a (probably different) NaN, and the possibility of folding
>>>>>>>> this to a constant (NaN), the standard says:
>>>>>>>> /"If two or more inputs are NaN, then the payload of the resulting NaN
>>>>>>>> should be identical to the payload of one of the input NaNs if
>>>>>>>> representable in the destination format. *This standard does not specify
>>>>>>>> which of the input NaNs will provide the payload*"/.
>>>>>>>>
>>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this
>>>>>>>> right?
>>>>>>>
>>>>>>> Yes, I agree.
>>>>>>>
>>>>>>> Ciao, Duncan.
>>>>>>>
>>>>>>>>
>>>>>>>> Oleg
>>>>>>>>
>>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>>> Hi Oleg,
>>>>>>>>>
>>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>>> My LLVM expertise is certainly not enough to make such decisions yet.
>>>>>>>>>> Duncan, do you have any comments on this or do you know anyone else
>>>>>>>>>> who can decide about preserving NaN payloads?
>>>>>>>>>
>>>>>>>>> my take is that the first thing to do is to see what the IEEE standard
>>>>>>>>> says about NaNs.  Consider for example "fadd x, -0.0". Does the standard
>>>>>>>>> specify the exact NaN bit pattern produced as output when a particular
>>>>>>>>> NaN x is input?  Or does it just say that the output is a NaN? If the
>>>>>>>>> standard doesn't care exactly which NaN is output, I think it is
>>>>>>>>> reasonable for LLVM to assume it is whatever NaN is most convenient for
>>>>>>>>> LLVM; in this case that means using x itself as the output.
>>>>>>>>>
>>>>>>>>> However this approach does implicitly mean that we may end up not folding
>>>>>>>>> floating point operations completely deterministically: depending on the
>>>>>>>>> optimization that kicks in, in one case we might fold to NaN A, and in
>>>>>>>>> some different optimization we might fold the same expression to NaN B.
>>>>>>>>> I think this is pretty reasonable, but it is something to be aware of.
>>>>>>>>>
>>>>>>>>> Ciao, Duncan.