[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

Wed Sep 24 06:59:58 PDT 2014

Hi Oleg,

On 23/09/14 17:32, Oleg Ranevskyy wrote:
> Hi Duncan,
>
> On 23.09.2014 17:58, Duncan Sands wrote:
>> Hi Oleg,
>>
>> On 22/09/14 17:56, Oleg Ranevskyy wrote:
>>> Hi Duncan,
>>>
>>> On 17.09.2014 21:10, Duncan Sands wrote:
>>>> Hi Oleg,
>>>>
>>>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>>>> Hi,
>>>>>
>>>>> Thank you for all your helpful comments.
>>>>>
>>>>> To sum up, below is the list of correct folding examples for fadd:
>>>>>         (1)  fadd %x, -0.0            ->     %x
>>>>>         (2)  fadd undef, undef    ->     undef
>>>>>         (3)  fadd %x, undef         ->     NaN  (undef can be chosen to be
>>>>> a NaN, which is propagated)
>>>>>
>>>>> Looking through the code I found the "NoNaNs" flag accessed through an
>>>>> instance of the FastMathFlags class.
>>>>> (2) and (3) should probably depend on it.
>>>>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and
>>>>> we are not guaranteed to get an arbitrary bit pattern from fadd, right?
>>>>
>>>> I think it's exactly the other way round: if NoNaNs is set then you can fold
>>>> (2) and (3) to undef.  That's because (IIRC) the NoNaNs flag promises that no
>>>> NaNs will be used by the program. However "undef" could be a NaN, thus the
>>>> promise is broken, meaning the program is performing undefined behaviour, and
>>>> you can do whatever you want.
>>> Oh, I see the point now. I thought if NoNaNs was set then no NaNs were possible
>>> at all. But undef is still an arbitrary bit pattern that might occasionally be
>>> the same as the one of a NaN. Thank you for the explanation.
>>>
>>> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, whereas
>>> "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is set) or a
>>> NaN (NoNaNs is not set).
>>
>> for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always being
>> equal to %x (likewise: fdiv %x, 1.0 being equal to %x).  Is this true?
> Do you mean that we can't apply "fmul/fdiv undef, undef" to undef folding if
> "fmul/fdiv %x, 1.0" is not guaranteed to be %x?
> If we choose one undef to have an arbitrary bit pattern and another undef = 1.0,
> we need a guarantee to get the bit pattern of the first undef. Do I get it right?

yes.  Of course, if "fmul %x, 1.0" isn't always equal to %x then maybe there is 
some other reasoning that justifies the undef folding, but there does need to be 
a rigorous argument justifying it.
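
To make the argument concrete, here is a minimal Python sketch (not LLVM code)
that checks the identity on the host's double arithmetic. The helper names
bits, from_bits and mul_by_one_preserves_bits are made up for illustration.
It only verifies bit-for-bit equality for non-NaN inputs; for a NaN input it
checks no more than that the result is still a NaN, since whether the payload
survives is exactly the open question in this thread.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the raw 64-bit pattern of a Python float (IEEE-754 binary64)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b: int) -> float:
    """Build a float from a raw 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def mul_by_one_preserves_bits(x: float) -> bool:
    """True if x * 1.0 has exactly the same bit pattern as x."""
    return bits(x * 1.0) == bits(x)


# Non-NaN samples: both zeros, ordinary values, the smallest subnormal,
# the largest finite double, and both infinities.
samples = [0.0, -0.0, 1.5, -3.25, 5e-324, 1.7976931348623157e308,
           math.inf, -math.inf]
non_nan_ok = all(mul_by_one_preserves_bits(x) for x in samples)

# A quiet NaN with an arbitrary payload. We only check that the product is
# still a NaN; payload preservation is deliberately NOT asserted here.
nan_in = from_bits(0x7FF8000000001234)
nan_stays_nan = math.isnan(nan_in * 1.0)

print(non_nan_ok, nan_stays_nan)
```

If the identity holds for every bit pattern, then picking one undef as 1.0 and
the other as an arbitrary pattern lets "fmul undef, undef" produce any pattern
at all, which is what folding to undef requires.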

>
> I checked the standard regarding "x*1.0 == x" and found that only "10.4 Literal
> meaning and value-changing optimizations" addresses this. I don't pretend to
> thoroughly understand this paragraph yet, but it seems to me that language
> standards are required to preserve the literal meaning of the source code.
> Applying the identity property x*1 is a part of this. Here is a quote from IEEE-754:
>
> "The following value-changing transformations, among others, preserve the
> literal meaning of the source code:
> - Applying the identity property 0 + x when x is not zero and is not a
>   signaling NaN and the result has the same exponent as x.
> - Applying the identity property 1 × x when x is not a signaling NaN and the
>   result has the same exponent as x."
>
> Maybe Owen or Stephen would be able to clarify this.

Yes, hopefully they will step in.  It does seem to be saying that it is OK to 
simplify 1 * x to x, in which case we can say that we are doing this 
simplification and bob's your uncle.
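
For reference, the rules as summarised in the quoted text could be sketched
like this in Python. This is only an illustration of the discussion, not
InstructionSimplify's actual code or API: fold_fp_binop and its string
operands are hypothetical, the fmul/fdiv cases still rest on the 1 * x
argument above, and treating undef on either side of the operator the same
way is an assumption.

```python
from typing import Optional

UNDEF = "undef"
NAN = "NaN"


def fold_fp_binop(lhs: str, rhs: str, no_nans: bool) -> Optional[str]:
    """Hypothetical fold for fadd/fsub/fmul/fdiv with undef operands.

    Returns "undef", "NaN", or None when none of these rules applies.
    """
    lhs_undef = lhs == UNDEF
    rhs_undef = rhs == UNDEF

    # Both operands undef: any bit pattern can be produced, so fold to undef.
    if lhs_undef and rhs_undef:
        return UNDEF

    # Exactly one operand undef: undef may be chosen to be a NaN, which is
    # propagated. With NoNaNs set, a NaN breaks the flag's promise, so the
    # behaviour is undefined and undef is an acceptable result.
    if lhs_undef or rhs_undef:
        return UNDEF if no_nans else NAN

    # No undef operand: these rules say nothing.
    return None


print(fold_fp_binop("undef", "undef", no_nans=False))
print(fold_fp_binop("%x", "undef", no_nans=False))
print(fold_fp_binop("%x", "undef", no_nans=True))
```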

Ciao, Duncan.

>
> Thank you.
> Oleg
>>
>> Ciao, Duncan.
>>
>>>
>>> Oleg
>>>>
>>>>>
>>>>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs.
>>>>> Thus, the same rules seem applicable to them as well:
>>>>> ---------------------------------------------------------------------
>>>>> - fdiv:
>>>>>         (4) "fdiv %x, undef" is currently folded to undef.
>>>>
>>>> But should be folded to NaN, not undef.
>>>>
>>>>>         The code comment states this is done because undef might be a sNaN.
>>>>> We can't rely on sNaNs as they can either be masked or the platform might
>>>>> not have FP exceptions at all. Nevertheless, such folding is still correct
>>>>> due to the NaN propagation rules we found in the Standard - undef might be
>>>>> chosen to be a NaN and its payload will be propagated.
>>>>>         Moreover, this looks similar to (3) and can be folded to a NaN. /Is it
>>>>> worth doing?/
>>>>
>>>> As the current folding to undef is wrong, it has to be fixed.
>>>>
>>>>>
>>>>>         (5) fdiv undef, undef    ->    undef
>>>>
>>>> Yup.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>> - fmul:
>>>>>         (6) fmul undef, undef    ->    undef
>>>>
>>>> Yup.
>>>>
>>>>>         (7) fmul %x, undef -> NaN or undef (undef is a NaN, which is
>>>>> propagated)
>>>>
>>>> Should be folded to NaN, not undef.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>> - fsub:
>>>>>         (8) fsub %x, -0.0           ->    %x  (if %x is not -0.0; works
>>>>> this way now)
>>>>
>>>> Should this be: fsub %x, +0.0 ?
>>> fsub %x, +0.0 is also covered and always folded to %x.
>>> The version with -0.0 is similar except it additionally checks if %x is not
>>> -0.0.
>>>>
>>>>>         (9) fsub %x, undef -> NaN or undef (undef is a NaN, which is
>>>>> propagated)
>>>>
>>>> Should fold to NaN not undef.
>>>>
>>>>>       (10) fsub undef, undef   -> undef
>>>>
>>>> Yup.
>>>>
>>>> Ciao, Duncan.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> I will be very thankful if you could review this final summary and share your
>>>>> thoughts.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> P.S. Sorry for bothering you again and again.
>>>>> Just want to make sure I clearly understand the subject in order to make
>>>>> correct code changes and to be able to help others with this in the future.
>>>>>
>>>>> Kind regards,
>>>>> Oleg
>>>>>
>>>>> On 16.09.2014 21:42, Duncan Sands wrote:
>>>>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>>>>> As far as I know, LLVM does not try very hard to guarantee constant folded
>>>>>>> NaN payloads that match exactly what the target would generate.
>>>>>>
>>>>>> I'm with Owen here.  Unless ARM people object, I think it is reasonable to
>>>>>> say that at the LLVM IR level we may assume that the IEEE rules are followed.
>>>>>>
>>>>>> Ciao, Duncan.
>>>>>>
>>>>>>>
>>>>>>> —Owen
>>>>>>>
>>>>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Duncan,
>>>>>>>>
>>>>>>>> I reread everything we've discussed so far and would like to pay closer
>>>>>>>> attention to ARM's FPSCR register mentioned by Stephen.
>>>>>>>> It's really possible on ARM systems that floating point operations on
>>>>>>>> one or more qNaN operands return a NaN different from the operands, i.e.
>>>>>>>> the operand NaN is not propagated. This happens when the "default NaN"
>>>>>>>> flag is set in the FPSCR (floating point status and control register).
>>>>>>>> The result in this case is some default NaN value.
>>>>>>>>
>>>>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>>>>> InstructionSimplify, might produce a different result if %x is a NaN. This
>>>>>>>> breaks the NaN propagation rules the IEEE standard establishes and
>>>>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>>>>
>>>>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef". We can't
>>>>>>>> rely on getting an arbitrary NaN here on ARMs.
>>>>>>>>
>>>>>>>> Would you be able to confirm this please?
>>>>>>>>
>>>>>>>> Thank you in advance for your time!
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Oleg
>>>>>>>>
>>>>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>>>>> Hi Oleg,
>>>>>>>>>
>>>>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>>>>> Hi Duncan,
>>>>>>>>>>
>>>>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>>>>
>>>>>>>>>> *6.2 Operations with NaNs*
>>>>>>>>>> /"For an operation with quiet NaN inputs, other than maximum and minimum
>>>>>>>>>> operations, if a floating-point result is to be delivered the result
>>>>>>>>>> shall be a quiet NaN which should be one of the input NaNs"/.
>>>>>>>>>>
>>>>>>>>>> *6.2.3 NaN propagation*
>>>>>>>>>> /"An operation that propagates a NaN operand to its result and has a
>>>>>>>>>> single NaN as an input should produce a NaN with the payload of the
>>>>>>>>>> input NaN if representable in the destination format"./
>>>>>>>>>
>>>>>>>>> thanks for finding this out.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Floating point add propagates a NaN. There is no conversion in the
>>>>>>>>>> context of LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the
>>>>>>>>>> result is also a NaN with the same payload.
>>>>>>>>>
>>>>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies that "fadd
>>>>>>>>> undef, undef" can be folded to "undef".
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and undef might be
>>>>>>>>>> chosen to be (probably some different) NaN, and a possibility to fold
>>>>>>>>>> this to a constant (NaN), the standard says:
>>>>>>>>>> /"If two or more inputs are NaN, then the payload of the resulting NaN
>>>>>>>>>> should be identical to the payload of one of the input NaNs if
>>>>>>>>>> representable in the destination format. *This standard does not specify
>>>>>>>>>> which of the input NaNs will provide the payload*"/.
>>>>>>>>>>
>>>>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this
>>>>>>>>>> right?
>>>>>>>>>
>>>>>>>>> Yes, I agree.
>>>>>>>>>
>>>>>>>>> Ciao, Duncan.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Oleg
>>>>>>>>>>
>>>>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>>>>> Hi Oleg,
>>>>>>>>>>>
>>>>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>>>>> My LLVM expertise is certainly not enough to make such decisions yet.
>>>>>>>>>>>> Duncan, do you have any comments on this or do you know anyone else
>>>>>>>>>>>> who can decide about preserving NaN payloads?
>>>>>>>>>>>
>>>>>>>>>>> my take is that the first thing to do is to see what the IEEE standard
>>>>>>>>>>> says about NaNs.  Consider for example "fadd x, -0.0". Does the standard
>>>>>>>>>>> specify the exact NaN bit pattern produced as output when a particular
>>>>>>>>>>> NaN x is input?  Or does it just say that the output is a NaN? If the
>>>>>>>>>>> standard doesn't care exactly which NaN is output, I think it is
>>>>>>>>>>> reasonable for LLVM to assume it is whatever NaN is most convenient for
>>>>>>>>>>> LLVM; in this case that means using x itself as the output.
>>>>>>>>>>>
>>>>>>>>>>> However this approach does implicitly mean that we may end up not
>>>>>>>>>>> folding floating point operations completely deterministically:
>>>>>>>>>>> depending on the optimization that kicks in, in one case we might fold
>>>>>>>>>>> to NaN A, and in some different optimization we might fold the same
>>>>>>>>>>> expression to NaN B.  I think this is pretty reasonable, but it is
>>>>>>>>>>> something to be aware of.
>>>>>>>>>>>
>>>>>>>>>>> Ciao, Duncan.