[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

Mon Sep 22 08:56:51 PDT 2014

Hi Duncan,

On 17.09.2014 21:10, Duncan Sands wrote:
> Hi Oleg,
>
> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>> Hi,
>>
>> Thank you for all your helpful comments.
>>
>> To sum up, below is the list of correct folding examples for fadd:
>>         (1)  fadd %x, -0.0            ->     %x
>>         (2)  fadd undef, undef    ->     undef
>>         (3)  fadd %x, undef         ->     NaN  (undef is a NaN which is propagated)
>>
>> Looking through the code I found the "NoNaNs" flag accessed through 
>> an instance
>> of the FastMathFlags class.
>> (2) and (3) should probably depend on it.
>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs 
>> and we are
>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>
> I think it's exactly the other way round: if NoNaNs is set then you
> can fold (2) and (3) to undef.  That's because (IIRC) the NoNaNs flag
> promises that no NaNs will be used by the program. However "undef"
> could be a NaN, thus the promise is broken, meaning the program is
> performing undefined behaviour, and you can do whatever you want.
Oh, I see the point now. I thought that if NoNaNs was set, no NaNs were
possible at all. But undef is still an arbitrary bit pattern that might
happen to coincide with a NaN encoding. Thank you for the explanation.

Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, 
whereas "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef 
(NoNaNs is set) or a NaN (NoNaNs is not set).
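
A rough sketch of these rules as a small Python model, just to make the
decision table explicit. This is not LLVM code: the names Fold, UNDEF and
fold_fp_binop are made up for illustration, and the model only covers the
undef cases discussed in this thread.

```python
from enum import Enum


class Fold(Enum):
    UNDEF = "undef"
    NAN = "NaN"
    NONE = "no fold"


# Hypothetical stand-in for an IR undef operand (not an LLVM object).
UNDEF = object()


def fold_fp_binop(lhs, rhs, no_nans):
    """Model the undef folding rules for fadd/fsub/fmul/fdiv.

    no_nans models the NoNaNs fast-math flag being set on the instruction.
    """
    lhs_undef = lhs is UNDEF
    rhs_undef = rhs is UNDEF
    if lhs_undef and rhs_undef:
        # "op undef, undef" can always be folded to undef.
        return Fold.UNDEF
    if lhs_undef or rhs_undef:
        # "op %x, undef": undef may be chosen to be a NaN, which propagates.
        # With NoNaNs set, the no-NaN promise is broken, so undef is allowed.
        return Fold.UNDEF if no_nans else Fold.NAN
    # Neither operand is undef: these rules do not apply.
    return Fold.NONE
```

For example, fold_fp_binop("%x", UNDEF, no_nans=False) gives Fold.NAN, while
the same call with no_nans=True gives Fold.UNDEF.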

Oleg
>
>>
>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate 
>> NaNs. Thus, the
>> same rules seem applicable to them as well:
>> ---------------------------------------------------------------------
>> - fdiv:
>>         (4) "fdiv %x, undef" is now folded to undef.
>
> But should be folded to NaN, not undef.
>
>>         The code comment states this is done because undef might be a 
>> sNaN. We
>> can't rely on sNaNs as they can either be masked or the platform 
>> might not have
>> FP exceptions at all. Nevertheless, such folding is still correct due 
>> to the NaN
>> propagation rules we found in the Standard - undef might be chosen to 
>> be a NaN
>> and its payload will be propagated.
>>         Moreover, this looks similar to (3) and can be folded to a 
>> NaN. /Is it
>> worth doing?/
>
> As the current folding to undef is wrong, it has to be fixed.
>
>>
>>         (5) fdiv undef, undef    ->    undef
>
> Yup.
>
>> ---------------------------------------------------------------------
>> - fmul:
>>         (6) fmul undef, undef    ->    undef
>
> Yup.
>
>>         (7) fmul %x, undef        -> NaN or undef (undef is a NaN, which is propagated)
>
> Should be folded to NaN, not undef.
>
>> ---------------------------------------------------------------------
>> - fsub:
>>         (8) fsub %x, -0.0           ->    %x  (if %x is not -0.0; works this way now)
>
> Should this be: fsub %x, +0.0 ?
fsub %x, +0.0 is also covered and is always folded to %x.
The version with -0.0 is similar, except that it additionally checks that %x
is not -0.0, since -0.0 - (-0.0) yields +0.0 under the default rounding mode.
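
The signed-zero cases can be checked with ordinary IEEE doubles. Below is a
minimal sketch in Python (not LLVM code; the helper names are made up for
illustration, and it assumes the default round-to-nearest mode):

```python
import math


def same_value_and_sign(a, b):
    """True if a and b compare equal and also agree in sign (separates +0.0 from -0.0)."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


def folds_to_x(op, x):
    """True if replacing op(x) with x itself would not change the result."""
    return same_value_and_sign(op(x), x)


def fadd_neg_zero(x):
    return x + (-0.0)


def fsub_pos_zero(x):
    return x - 0.0


def fsub_neg_zero(x):
    return x - (-0.0)


samples = (0.0, -0.0, 1.5, -2.0)
for name, op in (("fadd %x, -0.0", fadd_neg_zero),
                 ("fsub %x, +0.0", fsub_pos_zero),
                 ("fsub %x, -0.0", fsub_neg_zero)):
    bad = [x for x in samples if not folds_to_x(op, x)]
    print(name, "-> not foldable for:", bad)
```

Only the "fsub %x, -0.0" row reports an exception, and it is %x = -0.0, which
is why that fold needs the extra check.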
>
>>         (9) fsub %x, undef        -> NaN or undef (undef is a NaN, which is propagated)
>
> Should fold to NaN not undef.
>
>>       (10) fsub undef, undef   -> undef
>
> Yup.
>
> Ciao, Duncan.
>
>> ---------------------------------------------------------------------
>>
>> I will be very thankful if you could review this final summary and 
>> share your
>> thoughts.
>>
>> Thank you.
>>
>> P.S. Sorry for bothering you again and again.
>> Just want to make sure I clearly understand the subject in order to 
>> make correct
>> code changes and to be able to help others with this in the future.
>>
>> Kind regards,
>> Oleg
>>
>> On 16.09.2014 21:42, Duncan Sands wrote:
>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>> As far as I know, LLVM does not try very hard to guarantee constant 
>>>> folded
>>>> NaN payloads that match exactly what the target would generate.
>>>
>>> I'm with Owen here.  Unless ARM people object, I think it is 
>>> reasonable to say
>>> that at the LLVM IR level we may assume that the IEEE rules are 
>>> followed.
>>>
>>> Ciao, Duncan.
>>>
>>>>
>>>> —Owen
>>>>
>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy 
>>>>> <llvm.mail.list at gmail.com> wrote:
>>>>>
>>>>> Hi Duncan,
>>>>>
>>>>> I reread everything we've discussed so far and would like to pay closer
>>>>> attention to ARM's FPSCR register mentioned by Stephen.
>>>>> It's really possible on ARM systems that floating point operations 
>>>>> on one or
>>>>> more qNaN operands return a NaN different from the operands. I.e. 
>>>>> operand
>>>>> NaN is not propagated. This happens when the "default NaN" flag is 
>>>>> set in
>>>>> the FPSCR (floating point status and control register). The result 
>>>>> in this
>>>>> case is some default NaN value.
>>>>>
>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>> InstructionSimplify, might produce a different result if %x is a 
>>>>> NaN. This
>>>>> breaks the NaN propagation rules the IEEE standard establishes and
>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>
>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef". We 
>>>>> can't rely
>>>>> on getting an arbitrary NaN here on ARMs.
>>>>>
>>>>> Would you be able to confirm this please?
>>>>>
>>>>> Thank you in advance for your time!
>>>>>
>>>>> Kind regards,
>>>>> Oleg
>>>>>
>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>> Hi Oleg,
>>>>>>
>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>> Hi Duncan,
>>>>>>>
>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>
>>>>>>> *6.2 Operations with NaNs*
>>>>>>> /"For an operation with quiet NaN inputs, other than maximum and 
>>>>>>> minimum
>>>>>>> operations, if a floating-point result is to be delivered the 
>>>>>>> result shall
>>>>>>> be a
>>>>>>> quiet NaN which should be one of the input NaNs"/.
>>>>>>>
>>>>>>> *6.2.3 NaN propagation*
>>>>>>> /"An operation that propagates a NaN operand to its result and 
>>>>>>> has a
>>>>>>> single NaN
>>>>>>> as an input should produce a NaN with the payload of the input 
>>>>>>> NaN if
>>>>>>> representable in the destination format"./
>>>>>>
>>>>>> thanks for finding this out.
>>>>>>
>>>>>>>
>>>>>>> Floating point add propagates a NaN. There is no conversion in 
>>>>>>> the context of
>>>>>>> LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result 
>>>>>>> is also a NaN
>>>>>>> with the same payload.
>>>>>>
>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct.  This implies 
>>>>>> that "fadd
>>>>>> undef, undef" can be folded to "undef".
>>>>>>
>>>>>>>
>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and undef 
>>>>>>> might be
>>>>>>> chosen
>>>>>>> to be (probably some different) NaN, and a possibility to fold 
>>>>>>> this to a
>>>>>>> constant (NaN), the standard says:
>>>>>>> /"If two or more inputs are NaN, then the payload of the 
>>>>>>> resulting NaN
>>>>>>> should be
>>>>>>> identical to the payload of one of the input NaNs if 
>>>>>>> representable in the
>>>>>>> destination format. *This standard does not specify which of the 
>>>>>>> input
>>>>>>> NaNs will
>>>>>>> provide the payload*"/.
>>>>>>>
>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. 
>>>>>>> Is this
>>>>>>> right?
>>>>>>
>>>>>> Yes, I agree.
>>>>>>
>>>>>> Ciao, Duncan.
>>>>>>
>>>>>>>
>>>>>>> Oleg
>>>>>>>
>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>> Hi Oleg,
>>>>>>>>
>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>> My LLVM expertise is certainly not enough to make such 
>>>>>>>>> decisions yet.
>>>>>>>>> Duncan, do you have any comments on this or do you know anyone 
>>>>>>>>> else who can
>>>>>>>>> decide about preserving NaN payloads?
>>>>>>>>
>>>>>>>> my take is that the first thing to do is to see what the IEEE 
>>>>>>>> standard says
>>>>>>>> about NaNs.  Consider for example "fadd x, -0.0". Does the 
>>>>>>>> standard specify
>>>>>>>> the exact NaN bit pattern produced as output when a particular 
>>>>>>>> NaN x is
>>>>>>>> input?  Or does it just say that the output is a NaN? If the 
>>>>>>>> standard
>>>>>>>> doesn't
>>>>>>>> care exactly which NaN is output, I think it is reasonable for 
>>>>>>>> LLVM to
>>>>>>>> assume
>>>>>>>> it is whatever NaN is most convenient for LLVM; in this case 
>>>>>>>> that means
>>>>>>>> using
>>>>>>>> x itself as the output.
>>>>>>>>
>>>>>>>> However this approach does implicitly mean that we may end up 
>>>>>>>> not folding
>>>>>>>> floating point operations completely deterministically: 
>>>>>>>> depending on the
>>>>>>>> optimization that kicks in, in one case we might fold to NaN A, 
>>>>>>>> and in some
>>>>>>>> different optimization we might fold the same expression to NaN 
>>>>>>>> B.  I think
>>>>>>>> this is pretty reasonable, but it is something to be aware of.
>>>>>>>>
>>>>>>>> Ciao, Duncan.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
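
P.S. For anyone who wants to observe the NaN propagation behaviour quoted
from IEEE 754 (6.2 and 6.2.3) on their own machine, here is a minimal Python
sketch. The helper names and the payload value are made up for illustration.
Whether the payload survives depends on the hardware and its FP mode (for
example the ARM "default NaN" setting mentioned in the thread), so the sketch
prints the result bits and does not assume a particular value.

```python
import math
import struct


def bits_to_double(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def double_to_bits(value):
    """Reinterpret an IEEE 754 double as a 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# A quiet NaN (exponent all ones, top mantissa bit set) with an
# arbitrary, illustrative payload in the low bits.
QNAN_WITH_PAYLOAD = 0x7FF8000000001234

x = bits_to_double(QNAN_WITH_PAYLOAD)
result = x + (-0.0)  # models "fadd %x, -0.0" with %x being a NaN

print("input bits: ", hex(double_to_bits(x)))
print("result bits:", hex(double_to_bits(result)))
print("result is NaN:", math.isnan(result))
```

The result is a NaN in any case; comparing the two printed bit patterns shows
whether the platform propagated the input payload.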