[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Oleg Ranevskyy
llvm.mail.list at gmail.com
Tue Sep 23 08:32:04 PDT 2014
Hi Duncan,
On 23.09.2014 17:58, Duncan Sands wrote:
> Hi Oleg,
>
> On 22/09/14 17:56, Oleg Ranevskyy wrote:
>> Hi Duncan,
>>
>> On 17.09.2014 21:10, Duncan Sands wrote:
>>> Hi Oleg,
>>>
>>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>>> Hi,
>>>>
>>>> Thank you for all your helpful comments.
>>>>
>>>> To sum up, below is the list of correct folding examples for fadd:
>>>> (1) fadd %x, -0.0 -> %x
>>>> (2) fadd undef, undef -> undef
>>>> (3) fadd %x, undef -> NaN (undef is a NaN, which is propagated)
>>>>
>>>> Looking through the code I found the "NoNaNs" flag accessed through an
>>>> instance of the FastMathFlags class.
>>>> (2) and (3) should probably depend on it.
>>>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and
>>>> we are not guaranteed to get an arbitrary bit pattern from fadd, right?
>>>
>>> I think it's exactly the other way round: if NoNaNs is set then you can
>>> fold (2) and (3) to undef. That's because (IIRC) the NoNaNs flag promises
>>> that no NaNs will be used by the program. However "undef" could be a NaN,
>>> thus the promise is broken, meaning the program is performing undefined
>>> behaviour, and you can do whatever you want.
>> Oh, I see the point now. I thought if NoNaNs was set then no NaNs were
>> possible at all. But undef is still an arbitrary bit pattern that might
>> occasionally be the same as the one of a NaN. Thank you for the explanation.
>>
>> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef,
>> whereas "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs
>> is set) or a NaN (NoNaNs is not set).
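
The rule summarized in the quoted paragraph above can be written down as a small decision function. This is only a sketch of the rule as stated in this thread, not LLVM code: `FoldResult` and `foldWithUndef` are hypothetical names, and treating an undef on the left-hand side the same as one on the right-hand side is an assumption.

```cpp
#include <cassert>

// Hypothetical names for illustration only; this is not LLVM's API.
enum class FoldResult { Undef, NaN, NoFold };

// Models the rule discussed in this thread for fadd/fsub/fmul/fdiv.
// Assumption: an undef on either side is treated the same way.
constexpr FoldResult foldWithUndef(bool lhsIsUndef, bool rhsIsUndef,
                                   bool noNaNs) {
  if (lhsIsUndef && rhsIsUndef)
    return FoldResult::Undef;  // both operands are arbitrary bit patterns
  if (lhsIsUndef || rhsIsUndef) {
    // Exactly one operand is undef. It may be chosen to be a NaN.
    // With NoNaNs set that breaks the flag's promise, so anything goes;
    // without it, the chosen NaN propagates to the result.
    return noNaNs ? FoldResult::Undef : FoldResult::NaN;
  }
  return FoldResult::NoFold;   // no undef operand, nothing to fold here
}
```

Under this model, `assert(foldWithUndef(false, true, false) == FoldResult::NaN);` holds, which matches case (3) for fadd.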
>
> for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always
> being equal to %x (likewise: fdiv %x, 1.0 being equal to %x). Is this
> true?
Do you mean that we can't fold "fmul/fdiv undef, undef" to undef if
"fmul/fdiv %x, 1.0" is not guaranteed to be %x?
If we choose one undef to be an arbitrary bit pattern and the other undef
to be 1.0, we need a guarantee that the result has the bit pattern of the
first undef. Do I get it right?
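
A minimal sketch of how the "x * 1.0 gives back the bits of x" property could be probed on a host machine. This is not LLVM code; `bitsOf` and `mulByOnePreservesBits` are hypothetical helper names. It only tells us what one particular host does, which is weaker than a guarantee from the standard, and for NaN inputs the outcome may differ between targets.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a float as its raw 32-bit pattern.
inline std::uint32_t bitsOf(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Returns true if x * 1.0f has exactly the bit pattern of x on this host.
// The volatile variables keep the compiler from simplifying the multiply
// away at compile time, so the hardware operation is what gets observed.
inline bool mulByOnePreservesBits(float x) {
  volatile float a = x;
  volatile float one = 1.0f;
  float r = a * one;
  return bitsOf(r) == bitsOf(x);
}
```

For non-NaN inputs such as `-0.0f`, `1.5f` or infinity the check is expected to succeed, since the product is exact and the sign is preserved. NaN inputs are the interesting case and are deliberately left for experiment.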
I checked the standard regarding "x*1.0 == x" and found that only section
"10.4 Literal meaning and value-changing optimizations" addresses this. I
don't pretend to thoroughly understand this section yet, but it seems to me
that language standards are required to preserve the literal meaning of the
source code, and applying the identity property 1 × x is part of this. Here
is a quote from IEEE-754:
"The following value-changing transformations, among others, preserve
the literal meaning of the source code:
― Applying the identity property 0 + x when x is not zero and is not a
signaling NaN and the result has the same exponent as x.
― Applying the identity property 1 × x when x is not a signaling NaN
and the result has the same exponent as x."

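
The "when x is not zero" condition on the 0 + x identity in the quoted text is about signed zeros, and it is also why the fold in (1) uses -0.0 rather than +0.0. A small sketch of this, assuming the default round-to-nearest mode; `rawBits`, `plusZeroAddIsIdentity` and `minusZeroAddIsIdentity` are hypothetical helper names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its raw 32-bit pattern.
inline std::uint32_t rawBits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// True if (+0.0f + x) has the same bit pattern as x.
// volatile prevents the addition from being simplified at compile time.
inline bool plusZeroAddIsIdentity(float x) {
  volatile float zero = 0.0f;
  volatile float a = x;
  float r = zero + a;
  return rawBits(r) == rawBits(x);
}

// True if (-0.0f + x) has the same bit pattern as x.
inline bool minusZeroAddIsIdentity(float x) {
  volatile float negZero = -0.0f;
  volatile float a = x;
  float r = negZero + a;
  return rawBits(r) == rawBits(x);
}
```

With round-to-nearest, (+0.0) + (-0.0) is +0.0, so adding +0.0 changes the sign of x = -0.0, whereas adding -0.0 leaves both +0.0 and -0.0 unchanged.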
Maybe Owen or Stephen would be able to clarify this.
Thank you.
Oleg
>
> Ciao, Duncan.
>
>>
>> Oleg
>>>
>>>>
>>>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs.
>>>> Thus, the same rules seem applicable to them as well:
>>>> ---------------------------------------------------------------------
>>>> - fdiv:
>>>> (4) "fdiv %x, undef" is now folded to undef.
>>>
>>> But should be folded to NaN, not undef.
>>>
>>>> The code comment states this is done because undef might be an sNaN. We
>>>> can't rely on sNaNs as they can either be masked or the platform might
>>>> not have FP exceptions at all. Nevertheless, such folding is still
>>>> correct due to the NaN propagation rules we found in the Standard - undef
>>>> might be chosen to be a NaN and its payload will be propagated.
>>>> Moreover, this looks similar to (3) and can be folded to a NaN. Is it
>>>> worth doing?
>>>
>>> As the current folding to undef is wrong, it has to be fixed.
>>>
>>>>
>>>> (5) fdiv undef, undef -> undef
>>>
>>> Yup.
>>>
>>>> ---------------------------------------------------------------------
>>>> - fmul:
>>>> (6) fmul undef, undef -> undef
>>>
>>> Yup.
>>>
>>>> (7) fmul %x, undef -> NaN or undef (undef is a NaN, which is propagated)
>>>
>>> Should be folded to NaN, not undef.
>>>
>>>> ---------------------------------------------------------------------
>>>> - fsub:
>>>> (8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way now)
>>>
>>> Should this be: fsub %x, +0.0 ?
>> fsub %x, +0.0 is also covered and always folded to %x.
>> The version with -0.0 is similar except it additionally checks if %x is
>> not -0.0.
>>>
>>>> (9) fsub %x, undef -> NaN or undef (undef is a NaN, which is propagated)
>>>
>>> Should fold to NaN not undef.
>>>
>>>> (10) fsub undef, undef -> undef
>>>
>>> Yup.
>>>
>>> Ciao, Duncan.
>>>
>>>> ---------------------------------------------------------------------
>>>>
>>>> I will be very thankful if you could review this final summary and share
>>>> your thoughts.
>>>>
>>>> Thank you.
>>>>
>>>> P.S. Sorry for bothering you again and again.
>>>> I just want to make sure I clearly understand the subject in order to
>>>> make correct code changes and to be able to help others with this in the
>>>> future.
>>>>
>>>> Kind regards,
>>>> Oleg
>>>>
>>>> On 16.09.2014 21:42, Duncan Sands wrote:
>>>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>>>> As far as I know, LLVM does not try very hard to guarantee constant
>>>>>> folded NaN payloads that match exactly what the target would generate.
>>>>>
>>>>> I'm with Owen here. Unless ARM people object, I think it is reasonable
>>>>> to say that at the LLVM IR level we may assume that the IEEE rules are
>>>>> followed.
>>>>>
>>>>> Ciao, Duncan.
>>>>>
>>>>>>
>>>>>> —Owen
>>>>>>
>>>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Duncan,
>>>>>>>
>>>>>>> I reread everything we've discussed so far and would like to pay closer
>>>>>>> attention to ARM's FPSCR register mentioned by Stephen.
>>>>>>> It's really possible on ARM systems that floating point operations on
>>>>>>> one or more qNaN operands return a NaN different from the operands,
>>>>>>> i.e. the operand NaN is not propagated. This happens when the "default
>>>>>>> NaN" flag is set in the FPSCR (floating point status and control
>>>>>>> register). The result in this case is some default NaN value.
>>>>>>>
>>>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>>>> InstructionSimplify, might produce a different result if %x is a NaN.
>>>>>>> This breaks the NaN propagation rules the IEEE standard establishes and
>>>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>>>
>>>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef". We can't
>>>>>>> rely on getting an arbitrary NaN here on ARM.
>>>>>>>
>>>>>>> Would you be able to confirm this please?
>>>>>>>
>>>>>>> Thank you in advance for your time!
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Oleg
>>>>>>>
>>>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>>>> Hi Oleg,
>>>>>>>>
>>>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>>>> Hi Duncan,
>>>>>>>>>
>>>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>>>
>>>>>>>>> *6.2 Operations with NaNs*
>>>>>>>>> /"For an operation with quiet NaN inputs, other than maximum and
>>>>>>>>> minimum operations, if a floating-point result is to be delivered the
>>>>>>>>> result shall be a quiet NaN which should be one of the input NaNs"/.
>>>>>>>>>
>>>>>>>>> *6.2.3 NaN propagation*
>>>>>>>>> /"An operation that propagates a NaN operand to its result and has a
>>>>>>>>> single NaN as an input should produce a NaN with the payload of the
>>>>>>>>> input NaN if representable in the destination format"./
>>>>>>>>
>>>>>>>> thanks for finding this out.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Floating point add propagates a NaN. There is no conversion in the
>>>>>>>>> context of LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the
>>>>>>>>> result is also a NaN with the same payload.
>>>>>>>>
>>>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies that
>>>>>>>> "fadd undef, undef" can be folded to "undef".
>>>>>>>>
>>>>>>>>>
>>>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and undef might
>>>>>>>>> be chosen to be a (probably different) NaN, and the possibility of
>>>>>>>>> folding this to a constant (NaN), the standard says:
>>>>>>>>> /"If two or more inputs are NaN, then the payload of the resulting NaN
>>>>>>>>> should be identical to the payload of one of the input NaNs if
>>>>>>>>> representable in the destination format. *This standard does not
>>>>>>>>> specify which of the input NaNs will provide the payload*"/.
>>>>>>>>>
>>>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is
>>>>>>>>> this right?
>>>>>>>>
>>>>>>>> Yes, I agree.
>>>>>>>>
>>>>>>>> Ciao, Duncan.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Oleg
>>>>>>>>>
>>>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>>>> Hi Oleg,
>>>>>>>>>>
>>>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>>>> My LLVM expertise is certainly not enough to make such decisions yet.
>>>>>>>>>>> Duncan, do you have any comments on this or do you know anyone else
>>>>>>>>>>> who can decide about preserving NaN payloads?
>>>>>>>>>>
>>>>>>>>>> my take is that the first thing to do is to see what the IEEE standard
>>>>>>>>>> says about NaNs. Consider for example "fadd x, -0.0". Does the
>>>>>>>>>> standard specify the exact NaN bit pattern produced as output when a
>>>>>>>>>> particular NaN x is input? Or does it just say that the output is a
>>>>>>>>>> NaN? If the standard doesn't care exactly which NaN is output, I think
>>>>>>>>>> it is reasonable for LLVM to assume it is whatever NaN is most
>>>>>>>>>> convenient for LLVM; in this case that means using x itself as the
>>>>>>>>>> output.
>>>>>>>>>>
>>>>>>>>>> However this approach does implicitly mean that we may end up not
>>>>>>>>>> folding floating point operations completely deterministically:
>>>>>>>>>> depending on the optimization that kicks in, in one case we might fold
>>>>>>>>>> to NaN A, and in some different optimization we might fold the same
>>>>>>>>>> expression to NaN B. I think this is pretty reasonable, but it is
>>>>>>>>>> something to be aware of.
>>>>>>>>>>
>>>>>>>>>> Ciao, Duncan.
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>