[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

Wed Aug 27 09:17:00 PDT 2014

Duncan,
> Hi Oleg,
>
>>  >> /This is either a mistake, or a decision that in LLVM IR snans 
>> are always
>> considered to be signalling. /
>> Yes, this seems to be an agreement to treat "undef" as a SNaN for 
>> "fdiv".
>
> "undef" is whatever bit pattern you want it to be, i.e. the compiler 
> can assume it is any convenient value.  For one fdiv optimization 
> maybe it is convenient to choose it to be the bit pattern of an snan, 
> for some other fdiv optimization the compiler can choose it to be 1.0 
> if that is convenient.
>
>> The question is whether we can make the same assumption for other 
>> floating point
>> operations, or "fdiv" needs a correction to prevent folding since 
>> signalling of
>> SNaNs might be disabled.
>
> You can assume that undef is an snan if you want in any floating point 
> operation.  But what does that assumption buy you?  If you are willing 
> to assume that the processor will trap on snans then it buys you a lot 
> for any floating point operation, but if you are not willing to assume 
> that then you need to find some other trick or give up on folding the 
> operation to undef.
This is confusing me a bit.
On the one hand, we can assume undef to be an SNaN for the sake of 
folding. InstructionSimplify makes such an assumption and folds "fdiv 
undef, %X" and "fdiv %X, undef" to undef.
On the other hand, there are the requests to fold "fmul undef, undef" / 
"fadd undef, undef" to undef as well. However, it was stated that such 
folding is questionable as signalling of SNaNs can potentially be disabled.
Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions 
they make?
Could you please advise how we should proceed with the bugs 16257 and 
16258 then?
>
>>
>>  >> /InstructionSimplify folds "mul %X, undef" to 0 always/
>> Sorry, I malformed this line and forgot to highlight that by "%X" I 
>> meant a
>> constant here. So, constant folding comes into play. The result 
>> depends on the
>> constant parity. E.g.:
>>     mul i64 5, undef  --> undef
>>     mul i64 4, undef  --> 0
>> I still have difficulty understanding why the first one is folded to 
>> "undef".
>> "Undef" could be zero, so the result is also zero. Besides, if 
>> "undef" isn't
>> zero, it's impossible to get an arbitrary bit pattern anyway. The 
>> result will
>> always be divisible by 5.
>
> No, that's not true, because we are dealing with arithmetic modulo 
> 2^64 here. Since 5 and 2^64 are relatively prime, you can prove that 
> given any i64 value R, there exists an i64 value X such that mul i64 
> 5, X is equal to R.  Here's an example using i3 arithmetic (possible 
> values: 0, ..., 7).
>   mul i3 5, 0 -> 0
>   mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1)
>   mul i3 5, 2 -> 2
>   mul i3 5, 7 -> 3
>   mul i3 5, 4 -> 4
>   mul i3 5, 1 -> 5
>   mul i3 5, 6 -> 6
>   mul i3 5, 3 -> 7
> So you see that every possible i3 output is achievable.  Thus it is 
> correct to fold "mul i3 5, undef" to undef.  The same goes for any 
> other number that is relatively prime to 2, i.e. is an odd number.
Yes, this explanation is absolutely reasonable. Thanks!
>
> Ciao, Duncan.
>
>>
>> Would you be able to elaborate on this please?
>> Thank you!
>>
>> Kind regards,
>> Oleg
>>
>> On 26.08.2014 15:47, Duncan Sands wrote:
>>> Hi Oleg,
>>>
>>> On 26/08/14 21:20, Oleg Ranevskyy wrote:
>>>> Hi Duncan,
>>>>
>>>> Thank you for your comment to the bug 16257.
>>>>
>>>> I am new to LLVM, so not all the aspects of LLVM's /"undef"/ seem 
>>>> clear to me
>>>> yet.
>>>> I read and understood the examples from the LLVM documentation:
>>>> http://llvm.org/docs/LangRef.html#undefined-values
>>>>
>>>> However, those examples do not cover all of the possible contexts 
>>>> where
>>>> /"undef"/ can appear.
>>>> E.g., I can't realize why /"fmul undef, undef"/ may not result 
>>>> in/"undef"/
>>>> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I 
>>>> am missing
>>>> something important about floats here.
>>>
>>> in the case of mul undef, undef, the two inputs may be anything. In 
>>> order to
>>> fold the mul to undef you have to prove that the output may be 
>>> anything (any
>>> bit pattern).  For example it would be wrong to fold "mul 0, undef" 
>>> to undef
>>> since no matter what value you substitute for "undef" the output of 
>>> the mul
>>> will always be zero.  Note that in something like "mul undef, 
>>> undef", where
>>> undef occurs more than once, you are allowed to assume that the two 
>>> undefs are
>>> independent of each other, uncorrelated, i.e. in your reasoning you 
>>> can insert
>>> any bit pattern for the first undef and any other bit pattern for 
>>> the second
>>> undef.
>>>
>>> Here is a proof that "mul undef, undef" is undef.  Let R represent an
>>> arbitrary bit pattern.  We have to find two bit patterns X and Y 
>>> such that
>>> "mul X, Y" is equal to R.  For example set X to 1 and Y to R. This 
>>> ends the
>>> proof.
>>>
>>> Now consider "fmul undef, undef".  You could try to apply the same 
>>> reasoning,
>>> and maybe it is OK.  Let R represent an arbitrary bit pattern.  Set 
>>> Y = R and
>>> set X to be the bit pattern for the floating point number 1.0.  Then 
>>> "fmul X,
>>> Y" is equal to R, proving that "fmul undef, undef" can be folded to 
>>> "undef".
>>> But... is this correct?  Is it always true that "fmul 1.0, R" always 
>>> has
>>> exactly the same bits as R (this is not the same as the output 
>>> comparing equal
>>> to R using a floating point comparison)?  After all, R might be 
>>> something
>>> funky like a signed zero, an infinity, or a NaN.  I don't know if it 
>>> is always
>>> true if "fmul 1.0, R" always has the same bits as R, hopefully a 
>>> floating
>>> point expert can tell us.
>>>
>>> You could always argue that you could choose "undef" to be a 
>>> signalling NaN,
>>> causing the program to have undefined behaviour at the point of the
>>> multiplication, in which case you could do anything, but kindly 
>>> instead just
>>> fold the fmul to undef.  I don't much like using snans like this 
>>> though since
>>> they can be made non-signalling by poking the processor AFAIK.
>>>
>>>>
>>>> The same applies to "fdiv".
>>>> /"fdiv undef, undef"/ is not folded but /"div %X, undef"/ and /"div 
>>>> undef, %X"/
>>>> are folded to /"undef"/ (SimplifyFDivInst function in
>>>> lib/Analysis/InstructionSimplify.cpp).
>>>
>>> Here is the argument for "div %X, undef -> undef".  The undef value 
>>> might be
>>> zero (div %X, 0), so lets choose it to be zero. That means that the 
>>> program
>>> has undefined behaviour at this point, and so we could fold the div to
>>> "exit(1)" or "launch nuclear missile".  But instead we only fold it 
>>> to undef.
>>>
>>> You are wrong to say that "div undef, %X" is folded to "undef" by
>>> InstructionSimplify, it is folded to zero.  It would be wrong to 
>>> fold to
>>> undef, since (eg) if %X is 2 then you can't obtain an arbitrary bit 
>>> pattern as
>>> the output.
>>>
>>> Fdiv is harder than div because a floating point division by 0.0 has 
>>> a defined
>>> result, unlike for div.
>>>
>>>  Moreover, SimplifyFDivInst does not take
>>>> into account whether signalling of SNaNs can be switched off or not 
>>>> - it always
>>>> folds if one of the operands is /"undef"/.
>>>
>>> This is either a mistake, or a decision that in LLVM IR snans are 
>>> always
>>> considered to be signalling.
>>>
>>>>
>>>> Another mysterious thing for me is folding of /"mul %X, undef"/.
>>>> The result depends on whether %X is odd or even:
>>>>
>>>>   * "undef" if %X is odd or equal to "undef";
>>>>   * 0 otherwise.
>>>
>>> InstructionSimplify folds "mul %X, undef" to 0 always, since after 
>>> all "undef"
>>> could be zero, in which case the output is 0.  It would be wrong to 
>>> fold it to
>>> undef, as you point out, and it isn't folded to undef.
>>>
>>>>
>>>> There is a similar bug 16258 about folding of /"fadd undef, 
>>>> undef"/. /"Add"
>>>> /gets folded to /"undef"/ if one or both its operands are /"undef"/.
>>>> Should /"fadd"/ behave differently than integer /"add"/ too?
>>>
>>> It's the same problem.  It is easy to show that "add undef, %X" can 
>>> produce
>>> any desired bit pattern as output, but it is less clear whether that 
>>> is true
>>> for fadd, unless you use the snan argument.
>>>
>>> Ciao, Duncan.
>>>
>>>>
>>>> I've tried to google these questions, scanned StackOverflow, but 
>>>> didn't find any
>>>> good articles / posts answering them. LLVM's /"undef"/ is covered 
>>>> quite poorly.
>>>>
>>>> Duncan, do you know of any web resources explaining this in more 
>>>> detail?
>>>>
>>>> Thank you in advance for any help.
>>>>
>>>> Kind regards,
>>>> Oleg
>>>
>>
>