[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Mehdi Amini
joker.eph at gmail.com
Tue Jan 6 16:07:52 PST 2015
Hi Oleg,
What is the status on this?
Did you move on with a patch?
See also below:
On 9/23/14 8:32 AM, Oleg Ranevskyy wrote:
> Hi Duncan,
>
> On 23.09.2014 17:58, Duncan Sands wrote:
>> Hi Oleg,
>>
>> On 22/09/14 17:56, Oleg Ranevskyy wrote:
>>> Hi Duncan,
>>>
>>> On 17.09.2014 21:10, Duncan Sands wrote:
>>>> Hi Oleg,
>>>>
>>>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>>>> Hi,
>>>>>
>>>>> Thank you for all your helpful comments.
>>>>>
>>>>> To sum up, below is the list of correct folding examples for fadd:
>>>>> (1) fadd %x, -0.0 -> %x
>>>>> (2) fadd undef, undef -> undef
>>>>> (3) fadd %x, undef -> NaN (undef is a NaN, which is propagated)
>>>>>
>>>>> Looking through the code I found the "NoNaNs" flag accessed
>>>>> through an instance
>>>>> of the FastMathFlags class.
>>>>> (2) and (3) should probably depend on it.
>>>>> If the flag is set, (2) and (3) cannot be folded as there are no
>>>>> NaNs and we are
>>>>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>>>>
>>>> I think it's exactly the other way round: if NoNans is set then you
>>>> can fold
>>>> (2) and (3) to undef. That's because (IIRC) the NoNans flag
>>>> promises that no
>>>> NaNs will be used by the program. However "undef" could be a NaN,
>>>> thus the
>>>> promise is broken, meaning the program is performing undefined
>>>> behaviour, and
>>>> you can do whatever you want.
>>> Oh, I see the point now. I thought if NoNaNs was set then no NaNs
>>> were possible
>>> at all. But undef is still an arbitrary bit pattern that might
>>> occasionally be
>>> the same as the one of a NaN. Thank you for the explanation.
>>>
>>> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to
>>> undef, whereas
>>> "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is
>>> set) or a
>>> NaN (NoNaNs is not set).
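The rule as summarized at this point in the thread can be written down as a small decision table. This is only a sketch of the summary above, not LLVM code: the names `fold_fp_binop`, `UNDEF` and `NAN` are made up for illustration, operands are modelled as plain strings, and treating "undef, %x" the same as "%x, undef" is my assumption (the thread only spells out the latter).

```python
from typing import Optional

UNDEF = "undef"   # hypothetical stand-in for an undef operand
NAN = "NaN"       # hypothetical stand-in for a NaN constant

def fold_fp_binop(lhs: str, rhs: str, no_nans: bool) -> Optional[str]:
    """Folding of fadd/fsub/fmul/fdiv as summarized in the thread so far.

    Returns the folded result, or None when the rule does not apply.
    """
    if lhs == UNDEF and rhs == UNDEF:
        # Both inputs unconstrained: fold to undef regardless of flags.
        return UNDEF
    if lhs == UNDEF or rhs == UNDEF:
        # undef may be chosen to be a NaN. With NoNaNs that breaks the
        # flag's promise, so anything goes; otherwise the NaN propagates.
        return UNDEF if no_nans else NAN
    return None
```

For example, `fold_fp_binop("%x", UNDEF, no_nans=False)` gives `"NaN"`, while the same call with `no_nans=True` gives `"undef"`.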
>>
>> for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always
>> being equal to %x (likewise: fdiv %x, 1.0 being equal to %x). Is
>> this true?
> Do you mean that we can't fold "fmul/fdiv undef, undef" to undef
> if "fmul/fdiv %x, 1.0" is not guaranteed to be %x?
> If we choose one undef to have an arbitrary bit pattern and another
> undef = 1.0, we need a guarantee to get the bit pattern of the first
> undef. Do I get it right?
I don't think so. I don't think it makes sense to try to reason about
each individual case like that.
If you consider returning an undef, I believe the question is "can you
form all the possible bit patterns that the undef represents from the
expression?"
if you have:
op float undef, undef
then, since the inputs are unconstrained, the question is can "op" form
any possible float value as an output?
If yes then folding to undef is valid.
For instance fabs(undef) can't fold to undef because the set of values
undef represents is larger than what fabs can produce.
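To make the "can the op form every bit pattern?" test concrete, here is a minimal sketch. It is not LLVM code: it works on 16-bit (half precision) bit patterns so that the whole input space can be enumerated, it models fabs as "clear the sign bit", and the helper names (`fabs_bits`, `can_fold_to_undef`) are hypothetical.

```python
# Every 16-bit pattern; the same argument applies to float, just with
# more patterns than is pleasant to enumerate.
ALL_PATTERNS = range(1 << 16)

def fabs_bits(bits):
    """fabs modelled on raw bits: clear the sign bit (bit 15)."""
    return bits & 0x7FFF

def can_fold_to_undef(op):
    """True if op(undef) can yield every bit pattern, i.e. the image of
    op over all possible inputs is the full set of patterns."""
    return {op(b) for b in ALL_PATTERNS} == set(ALL_PATTERNS)

identity_ok = can_fold_to_undef(lambda b: b)  # covers everything
fabs_ok = can_fold_to_undef(fabs_bits)        # never sets the sign bit
```

Here `identity_ok` is true and `fabs_ok` is false: no input to fabs gives a result with the sign bit set, so half of the patterns undef could stand for are unreachable.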
--
Mehdi
>
> I checked the standard regarding "x*1.0 == x" and found that only
> "10.4 Literal meaning and value-changing optimizations" addresses
> this. I don't pretend to thoroughly understand this paragraph yet, but
> it seems to me that language standards are required to preserve the
> literal meaning of the source code. Applying the identity property x*1
> is a part of this. Here is a quote from IEEE-754:
>
> "The following value-changing transformations, among others, preserve
> the literal meaning of the source code:
> ― Applying the identity property 0 + x when x is not zero and is not
> a signaling NaN and the result has the same exponent as x.
> ― Applying the identity property 1 × x when x is not a signaling NaN
> and the result has the same exponent as x."
>
> Maybe Owen or Stephen would be able to clarify this.
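For the non-NaN part of the "x * 1.0 == x" question, a brute-force check is easy to sketch. The assumptions are mine: it uses half precision so all patterns can be enumerated, the arithmetic is done in Python's double precision and rounded back (exact for these operations), and NaN inputs are skipped because Python makes no promise about NaN payloads. The helper names are hypothetical.

```python
import math
import struct

def half_from_bits(bits):
    """Reinterpret a 16-bit pattern as a half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def bits_from_half(value):
    """Bit pattern of a value after rounding to half precision."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def changed_patterns(op):
    """Non-NaN bit patterns that op does not map back to themselves."""
    changed = []
    for bits in range(1 << 16):
        x = half_from_bits(bits)
        if math.isnan(x):
            continue  # payload behaviour is outside what this checks
        if bits_from_half(op(x)) != bits:
            changed.append(bits)
    return changed

fmul_changed = changed_patterns(lambda x: x * 1.0)
fdiv_changed = changed_patterns(lambda x: x / 1.0)
```

Both lists come out empty, so for zeros, subnormals, normals and infinities the identity holds bit for bit. The open question in the thread is only about NaN inputs.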
>
> Thank you.
> Oleg
>>
>> Ciao, Duncan.
>>
>>>
>>> Oleg
>>>>
>>>>>
>>>>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate
>>>>> NaNs. Thus, the
>>>>> same rules seem applicable to them as well:
>>>>> ---------------------------------------------------------------------
>>>>> - fdiv:
>>>>> (4) "fdiv %x, undef" is now folded to undef.
>>>>
>>>> But should be folded to NaN, not undef.
>>>>
>>>>> The code comment states this is done because undef might
>>>>> be a sNaN. We
>>>>> can't rely on sNaNs as they can either be masked or the platform
>>>>> might not have
>>>>> FP exceptions at all. Nevertheless, such folding is still correct
>>>>> due to the NaN
>>>>> propagation rules we found in the Standard - undef might be chosen
>>>>> to be a NaN
>>>>> and its payload will be propagated.
>>>>> Moreover, this looks similar to (3) and can be folded to a
>>>>> NaN. /Is it
>>>>> worth doing?/
>>>>
>>>> As the current folding to undef is wrong, it has to be fixed.
>>>>
>>>>>
>>>>> (5) fdiv undef, undef -> undef
>>>>
>>>> Yup.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>> - fmul:
>>>>> (6) fmul undef, undef -> undef
>>>>
>>>> Yup.
>>>>
>>>>> (7) fmul %x, undef -> NaN or undef (undef is a NaN, which is
>>>>> propagated)
>>>>
>>>> Should be folded to NaN, not undef.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>> - fsub:
>>>>> (8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way now)
>>>>
>>>> Should this be: fsub %x, +0.0 ?
>>> fsub %x, +0.0 is also covered and always folded to %x.
>>> The version with -0.0 is similar except it additionally checks if %x
>>> is not -0.0.
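The reason for the extra -0.0 check is the sign of zero. A small sketch in Python (doubles, default round-to-nearest; `same_value_and_sign` is a hypothetical helper) shows which identities hold when %x is -0.0:

```python
import math

def same_value_and_sign(a, b):
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

neg_zero = -0.0

fsub_pos = neg_zero - 0.0      # -0.0 - (+0.0) stays -0.0
fsub_neg = neg_zero - (-0.0)   # -0.0 - (-0.0) becomes +0.0
fadd_neg = neg_zero + (-0.0)   # -0.0 + (-0.0) stays -0.0
fadd_pos = neg_zero + 0.0      # -0.0 + (+0.0) becomes +0.0
```

So "fsub %x, +0.0" and "fadd %x, -0.0" return %x even for %x = -0.0, while "fsub %x, -0.0" and "fadd %x, +0.0" flip the sign of a negative zero and need the extra check.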
>>>>
>>>>> (9) fsub %x, undef -> NaN or undef (undef is a NaN, which is
>>>>> propagated)
>>>>
>>>> Should fold to NaN not undef.
>>>>
>>>>> (10) fsub undef, undef -> undef
>>>>
>>>> Yup.
>>>>
>>>> Ciao, Duncan.
>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> I will be very thankful if you could review this final summary and
>>>>> share your
>>>>> thoughts.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> P.S. Sorry for bothering you again and again.
>>>>> Just want to make sure I clearly understand the subject in order
>>>>> to make correct
>>>>> code changes and to be able to help others with this in the future.
>>>>>
>>>>> Kind regards,
>>>>> Oleg
>>>>>
>>>>> On 16.09.2014 21:42, Duncan Sands wrote:
>>>>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>>>>> As far as I know, LLVM does not try very hard to guarantee
>>>>>>> constant folded
>>>>>>> NaN payloads that match exactly what the target would generate.
>>>>>>
>>>>>> I'm with Owen here. Unless ARM people object, I think it is
>>>>>> reasonable to say
>>>>>> that at the LLVM IR level we may assume that the IEEE rules are
>>>>>> followed.
>>>>>>
>>>>>> Ciao, Duncan.
>>>>>>
>>>>>>>
>>>>>>> —Owen
>>>>>>>
>>>>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy
>>>>>>>> <llvm.mail.list at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Duncan,
>>>>>>>>
>>>>>>>> I reread everything we've discussed so far and would like to
>>>>>>>> pay closer
>>>>>>>> attention to ARM's FPSCR register mentioned by Stephen.
>>>>>>>> It's really possible on ARM systems that floating point
>>>>>>>> operations on one or
>>>>>>>> more qNaN operands return a NaN different from the operands.
>>>>>>>> I.e. operand
>>>>>>>> NaN is not propagated. This happens when the "default NaN" flag
>>>>>>>> is set in
>>>>>>>> the FPSCR (floating point status and control register). The
>>>>>>>> result in this
>>>>>>>> case is some default NaN value.
>>>>>>>>
>>>>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>>>>> InstructionSimplify, might produce a different result if %x is
>>>>>>>> a NaN. This
>>>>>>>> breaks the NaN propagation rules the IEEE standard establishes and
>>>>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>>>>
>>>>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef".
>>>>>>>> We can't rely
>>>>>>>> on getting an arbitrary NaN here on ARMs.
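A toy model of the concern raised here, on float bit patterns. Everything in it is an assumption made for illustration: the function names are hypothetical, the default NaN is taken to be the single-precision pattern 0x7FC00000, and "propagation" is modelled as returning the input NaN with its quiet bit set. It is not a description of any particular hardware.

```python
DEFAULT_NAN_F32 = 0x7FC00000  # assumed default quiet NaN (single precision)
QUIET_BIT_F32 = 0x00400000    # most significant fraction bit

def is_nan_f32(bits):
    """Exponent all ones and a non-zero fraction."""
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0

def fadd_neg_zero_model(x_bits, default_nan_mode):
    """Model of 'fadd %x, -0.0' on the bit pattern of %x."""
    if is_nan_f32(x_bits):
        if default_nan_mode:
            return DEFAULT_NAN_F32       # input payload is discarded
        return x_bits | QUIET_BIT_F32    # input NaN propagated, quieted
    return x_bits                        # x + (-0.0) == x for non-NaN x
```

With `default_nan_mode=False` a quiet NaN such as 0x7FC12345 comes back unchanged, which matches folding to %x. With `default_nan_mode=True` it comes back as 0x7FC00000, which is where the fold and the hardware result would differ.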
>>>>>>>>
>>>>>>>> Would you be able to confirm this please?
>>>>>>>>
>>>>>>>> Thank you in advance for your time!
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Oleg
>>>>>>>>
>>>>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>>>>> Hi Oleg,
>>>>>>>>>
>>>>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>>>>> Hi Duncan,
>>>>>>>>>>
>>>>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>>>>
>>>>>>>>>> *6.2 Operations with NaNs*
>>>>>>>>>> /"For an operation with quiet NaN inputs, other than maximum
>>>>>>>>>> and minimum
>>>>>>>>>> operations, if a floating-point result is to be delivered the
>>>>>>>>>> result shall
>>>>>>>>>> be a
>>>>>>>>>> quiet NaN which should be one of the input NaNs"/.
>>>>>>>>>>
>>>>>>>>>> *6.2.3 NaN propagation*
>>>>>>>>>> /"An operation that propagates a NaN operand to its result
>>>>>>>>>> and has a
>>>>>>>>>> single NaN
>>>>>>>>>> as an input should produce a NaN with the payload of the
>>>>>>>>>> input NaN if
>>>>>>>>>> representable in the destination format"./
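For reference when reading these quotes, a short sketch of how a float bit pattern splits into "quiet or signaling" and "payload". It assumes the common convention that the most significant fraction bit is the quiet bit; `classify_f32` is a hypothetical helper name.

```python
def classify_f32(bits):
    """Return (kind, payload) for a 32-bit float bit pattern."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return ("not-nan", 0)        # finite value or infinity
    quiet = (fraction >> 22) & 1     # top fraction bit marks a quiet NaN
    payload = fraction & 0x3FFFFF    # remaining 22 bits carry the payload
    return ("qNaN" if quiet else "sNaN", payload)
```

For example, 0x7FC00000 decodes as a quiet NaN with payload 0, and 0x7F800001 as a signaling NaN with payload 1. "Propagating the payload" in the sense of 6.2.3 means the result carries the same payload bits as the input NaN.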
>>>>>>>>>
>>>>>>>>> thanks for finding this out.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Floating point add propagates a NaN. There is no conversion
>>>>>>>>>> in the
>>>>>>>>>> context of
>>>>>>>>>> LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the
>>>>>>>>>> result is also a
>>>>>>>>>> NaN
>>>>>>>>>> with the same payload.
>>>>>>>>>
>>>>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies
>>>>>>>>> that "fadd
>>>>>>>>> undef, undef" can be folded to "undef".
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and
>>>>>>>>>> undef might be
>>>>>>>>>> chosen
>>>>>>>>>> to be (probably some different) NaN, and a possibility to
>>>>>>>>>> fold this to a
>>>>>>>>>> constant (NaN), the standard says:
>>>>>>>>>> /"If two or more inputs are NaN, then the payload of the
>>>>>>>>>> resulting NaN
>>>>>>>>>> should be
>>>>>>>>>> identical to the payload of one of the input NaNs if
>>>>>>>>>> representable in the
>>>>>>>>>> destination format. *This standard does not specify which of
>>>>>>>>>> the input
>>>>>>>>>> NaNs will
>>>>>>>>>> provide the payload*"/.
>>>>>>>>>>
>>>>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a
>>>>>>>>>> NaN. Is this
>>>>>>>>>> right?
>>>>>>>>>
>>>>>>>>> Yes, I agree.
>>>>>>>>>
>>>>>>>>> Ciao, Duncan.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Oleg
>>>>>>>>>>
>>>>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>>>>> Hi Oleg,
>>>>>>>>>>>
>>>>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>>>>> My LLVM expertise is certainly not enough to make such
>>>>>>>>>>>> decisions yet.
>>>>>>>>>>>> Duncan, do you have any comments on this or do you know
>>>>>>>>>>>> anyone else
>>>>>>>>>>>> who can
>>>>>>>>>>>> decide about preserving NaN payloads?
>>>>>>>>>>>
>>>>>>>>>>> my take is that the first thing to do is to see what the
>>>>>>>>>>> IEEE standard
>>>>>>>>>>> says
>>>>>>>>>>> about NaNs. Consider for example "fadd x, -0.0". Does the
>>>>>>>>>>> standard
>>>>>>>>>>> specify
>>>>>>>>>>> the exact NaN bit pattern produced as output when a
>>>>>>>>>>> particular NaN x is
>>>>>>>>>>> input? Or does it just say that the output is a NaN? If the
>>>>>>>>>>> standard
>>>>>>>>>>> doesn't
>>>>>>>>>>> care exactly which NaN is output, I think it is reasonable
>>>>>>>>>>> for LLVM to
>>>>>>>>>>> assume
>>>>>>>>>>> it is whatever NaN is most convenient for LLVM; in this case
>>>>>>>>>>> that means
>>>>>>>>>>> using
>>>>>>>>>>> x itself as the output.
>>>>>>>>>>>
>>>>>>>>>>> However this approach does implicitly mean that we may end
>>>>>>>>>>> up not folding
>>>>>>>>>>> floating point operations completely deterministically:
>>>>>>>>>>> depending on the
>>>>>>>>>>> optimization that kicks in, in one case we might fold to NaN
>>>>>>>>>>> A, and in
>>>>>>>>>>> some
>>>>>>>>>>> different optimization we might fold the same expression to
>>>>>>>>>>> NaN B. I
>>>>>>>>>>> think
>>>>>>>>>>> this is pretty reasonable, but it is something to be aware of.
>>>>>>>>>>>
>>>>>>>>>>> Ciao, Duncan.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
>