[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Oleg Ranevskyy
llvm.mail.list at gmail.com
Wed Aug 27 09:17:00 PDT 2014
Duncan,
> Hi Oleg,
>
>> >> /This is either a mistake, or a decision that in LLVM IR snans
>> are always
>> considered to be signalling. /
>> Yes, this seems to be an agreement to treat "undef" as a SNaN for
>> "fdiv".
>
> "undef" is whatever bit pattern you want it to be, i.e. the compiler
> can assume it is any convenient value. For one fdiv optimization
> maybe it is convenient to choose it to be the bit pattern of an snan,
> for some other fdiv optimization the compiler can choose it to be 1.0
> if that is convenient.
>
>> The question is whether we can make the same assumption for other
>> floating point
>> operations, or "fdiv" needs a correction to prevent folding since
>> signalling of
>> SNaNs might be disabled.
>
> You can assume that undef is an snan if you want in any floating point
> operation. But what does that assumption buy you? If you are willing
> to assume that the processor will trap on snans then it buys you a lot
> for any floating point operation, but if you are not willing to assume
> that then you need to find some other trick or give up on folding the
> operation to undef.
This is confusing me a bit.
On the one hand, we can assume undef to be an SNaN for the sake of
folding. InstructionSimplify makes such an assumption and folds "fdiv
undef, %X" and "fdiv %X, undef" to undef.
On the other hand, there are the requests to fold "fmul undef, undef" /
"fadd undef, undef" to undef as well. However, it was stated that such
folding is questionable as signalling of SNaNs can potentially be disabled.
Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions
they make?
Could you please advise how we should proceed with the bugs 16257 and
16258 then?
>
>>
>> >> /InstructionSimplify folds "mul %X, undef" to 0 always/
>> Sorry, I malformed this line and forgot to highlight that by "%X" I
>> meant a
>> constant here. So, constant folding comes into play. The result
>> depends on the
>> constant parity. E.g.:
>> mul i64 5, undef --> undef
>> mul i64 4, undef --> 0
>> I still have difficulty understanding why the first one is folded to
>> "undef".
>> "Undef" could be zero, so the result is also zero. Besides, if
>> "undef" isn't
>> zero, it's impossible to get an arbitrary bit pattern anyway. The
>> result will
>> always be divisible by 5.
>
> No, that's not true, because we are dealing with arithmetic modulo
> 2^64 here. Since 5 and 2^64 are relatively prime, you can prove that
> given any i64 value R, there exists an i64 value X such that mul i64
> 5, X is equal to R. Here's an example using i3 arithmetic (possible
> values: 0, ..., 7).
> mul i3 5, 0 -> 0
> mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1)
> mul i3 5, 2 -> 2
> mul i3 5, 7 -> 3
> mul i3 5, 4 -> 4
> mul i3 5, 1 -> 5
> mul i3 5, 6 -> 6
> mul i3 5, 3 -> 7
> So you see that every possible i3 output is achievable. Thus it is
> correct to fold "mul i3 5, undef" to undef. The same goes for any
> other number that is relatively prime to 2, i.e. is an odd number.
Yes, this explanation is absolutely reasonable. Thanks!
>
> Ciao, Duncan.
>
>>
>> Would you be able to elaborate on this please?
>> Thank you!
>>
>> Kind regards,
>> Oleg
>>
>> On 26.08.2014 15:47, Duncan Sands wrote:
>>> Hi Oleg,
>>>
>>> On 26/08/14 21:20, Oleg Ranevskyy wrote:
>>>> Hi Duncan,
>>>>
>>>> Thank you for your comment to the bug 16257.
>>>>
>>>> I am new to LLVM, so not all the aspects of LLVM's /"undef"/ seem
>>>> clear to me
>>>> yet.
>>>> I read and understood the examples from the LLVM documentation:
>>>> http://llvm.org/docs/LangRef.html#undefined-values
>>>>
>>>> However, those examples do not cover all of the possible contexts
>>>> where
>>>> /"undef"/ can appear.
>>>> E.g., I can't realize why /"fmul undef, undef"/ may not result
>>>> in/"undef"/
>>>> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I
>>>> am missing
>>>> something important about floats here.
>>>
>>> in the case of mul undef, undef, the two inputs may be anything. In
>>> order to
>>> fold the mul to undef you have to prove that the output may be
>>> anything (any
>>> bit pattern). For example it would be wrong to fold "mul 0, undef"
>>> to undef
>>> since no matter what value you substitute for "undef" the output of
>>> the mul
>>> will always be zero. Note that in something like "mul undef,
>>> undef", where
>>> undef occurs more than once, you are allowed to assume that the two
>>> undefs are
>>> independent of each other, uncorrelated, i.e. in your reasoning you
>>> can insert
>>> any bit pattern for the first undef and any other bit pattern for
>>> the second
>>> undef.
>>>
>>> Here is a proof that "mul undef, undef" is undef. Let R represent an
>>> arbitrary bit pattern. We have to find two bit patterns X and Y
>>> such that
>>> "mul X, Y" is equal to R. For example set X to 1 and Y to R. This
>>> ends the
>>> proof.
>>>
>>> Now consider "fmul undef, undef". You could try to apply the same
>>> reasoning,
>>> and maybe it is OK. Let R represent an arbitrary bit pattern. Set
>>> Y = R and
>>> set X to be the bit pattern for the floating point number 1.0. Then
>>> "fmul X,
>>> Y" is equal to R, proving that "fmul undef, undef" can be folded to
>>> "undef".
>>> But... is this correct? Is it always true that "fmul 1.0, R" always
>>> has
>>> exactly the same bits as R (this is not the same as the output
>>> comparing equal
>>> to R using a floating point comparison)? After all, R might be
>>> something
>>> funky like a signed zero, an infinity, or a NaN. I don't know if it
>>> is always
>>> true if "fmul 1.0, R" always has the same bits as R, hopefully a
>>> floating
>>> point expert can tell us.
>>>
>>> You could always argue that you could choose "undef" to be a
>>> signalling NaN,
>>> causing the program to have undefined behaviour at the point of the
>>> multiplication, in which case you could do anything, but kindly
>>> instead just
>>> fold the fmul to undef. I don't much like using snans like this
>>> though since
>>> they can be made non-signalling by poking the processor AFAIK.
>>>
>>>>
>>>> The same applies to "fdiv".
>>>> /"fdiv undef, undef"/ is not folded but /"div %X, undef"/ and /"div
>>>> undef, %X"/
>>>> are folded to /"undef"/ (SimplifyFDivInst function in
>>>> lib/Analysis/InstructionSimplify.cpp).
>>>
>>> Here is the argument for "div %X, undef -> undef". The undef value
>>> might be
>>> zero (div %X, 0), so lets choose it to be zero. That means that the
>>> program
>>> has undefined behaviour at this point, and so we could fold the div to
>>> "exit(1)" or "launch nuclear missile". But instead we only fold it
>>> to undef.
>>>
>>> You are wrong to say that "div undef, %X" is folded to "undef" by
>>> InstructionSimplify, it is folded to zero. It would be wrong to
>>> fold to
>>> undef, since (eg) if %X is 2 then you can't obtain an arbitrary bit
>>> pattern as
>>> the output.
>>>
>>> Fdiv is harder than div because a floating point division by 0.0 has
>>> a defined
>>> result, unlike for div.
>>>
>>> Moreover, SimplifyFDivInst does not take
>>>> into account whether signalling of SNaNs can be switched off or not
>>>> - it always
>>>> folds if one of the operands is /"undef"/.
>>>
>>> This is either a mistake, or a decision that in LLVM IR snans are
>>> always
>>> considered to be signalling.
>>>
>>>>
>>>> Another mysterious thing for me is folding of /"mul %X, undef"/.
>>>> The result depends on whether %X is odd or even:
>>>>
>>>> * "undef" if %X is odd or equal to "undef";
>>>> * 0 otherwise.
>>>
>>> InstructionSimplify folds "mul %X, undef" to 0 always, since after
>>> all "undef"
>>> could be zero, in which case the output is 0. It would be wrong to
>>> fold it to
>>> undef, as you point out, and it isn't folded to undef.
>>>
>>>>
>>>> There is a similar bug 16258 about folding of /"fadd undef,
>>>> undef"/. /"Add"
>>>> /gets folded to /"undef"/ if one or both its operands are /"undef"/.
>>>> Should /"fadd"/ behave differently than integer /"add"/ too?
>>>
>>> It's the same problem. It is easy to show that "add undef, %X" can
>>> produce
>>> any desired bit pattern as output, but it is less clear whether that
>>> is true
>>> for fadd, unless you use the snan argument.
>>>
>>> Ciao, Duncan.
>>>
>>>>
>>>> I've tried to google these questions, scanned StackOverflow, but
>>>> didn't find any
>>>> good articles / posts answering them. LLVM's /"undef"/ is covered
>>>> quite poorly.
>>>>
>>>> Duncan, do you know of any web resources explaining this in more
>>>> detail?
>>>>
>>>> Thank you in advance for any help.
>>>>
>>>> Kind regards,
>>>> Oleg
>>>
>>
>
More information about the llvm-dev
mailing list