[llvm-dev] The undef story

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 29 14:49:16 PDT 2017

2017-06-29 14:32 GMT-07:00 Peter Lawrence <peterl95124 at sbcglobal.net>:

> Mehdi,
>            I think the following was the point of the conversation,
> That both those examples are illegal C programs.
> They are both “undefined behavior” because they both
> use a shift amount that is too large.

Can you confirm that even if the shift isn't executed the program exhibits
undefined behavior?
That wasn't my understanding, so I don't believe this program exhibits UB.

> They both should have been rejected by the compiler
> even though they weren’t.
> Hal agrees wth this assessment,

I'm surprise by the confidence you're exhibiting while speaking for others.


> That’s why we’re waiting for a more complete example.
> My belief is that undefined behavior is an optimization hazard,
> Not an optimization opportunity.  Its just a belief, I could be proved
> Wrong at any moment, but it feels right to me.
> I would start looking for a more complete example myself, but my
> Belief is so strong that "optimizing undefined behavior" seems
> like a self-contradiction to me, and I don’t know where to
> Even start looking.
> I write compiler test programs in my spare time as a hobby,
> (which someday I’d like to contribute to llvm)
> So it’s not like I don’t have the knowledge or the inclination,
> I just don’t know how to approach this problem.
> You would think that since “optimization of undefined behavior”
> Has become such a bedrock concept in llvm that by now some
> Concrete examples would be readily at hand,
> But this doesn’t seem to be the case.
> So I’m eagerly awaiting Hal’s (or anyone else's) next email
> That has a complete example.
> Peter Lawrence.
>> I can't comment on SPEC, but this does remind me of code I was working on
>> recently. To abstract the relevant parts, it looked something like this:
>> template <typename T>
>> int do_something(T mask, bool cond) {
>>   if (mask & 2)
>>     return 1;
>>   if (cond) {
>>     T high_mask = mask >> 48;
>>     if (high_mask > 5)
>>       do_something_1(high_mask);
>>     else if (high_mask > 3)
>>       do_something_2();
>>   }
>>   return 0;
>> }
>> This function ended up being instantiated on different types T (e.g.
>> unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond
>> was always false when T was char. The question is: Can the compiler
>> eliminate all of the code predicated on cond for the smaller types? In this
>> case, this code was hot, and moreover, performance depended on the fact
>> that, for T = unsigned char, the function was inlined and the branch on
>> cond was eliminated. In the relevant translation unit, however, the
>> compiler would never see how cond was set.
>> Luckily, we do the right thing here currently. In the case where T =
>> unsigned char, we end up folding both of the high_mask tests as though they
>> were false. That entire part of the code is eliminated, the function is
>> inlined, and everyone is happy.
>> Why was I looking at this? As it turns out, if the 'else if' in this
>> example is just 'else', we don't actually eliminate both sides of the
>> branch. The same is true for many other variants of the conditionals (i.e.
>> we don't recognize all of the code as dead).
>> I apologize in advance if I have missed something here and am misreading
>> your example...
>> This doesn’t make sense to me, a shift amount of 48 is “undefined” for
>> unsigned char,
>> How do we know this isn’t a source code bug,
>> What makes us think the the user intended the result to be “0”.
>> As I said, this is representation of what the real code did, and looked
>> like, after other inlining had taken place, etc. In the original form, the
>> user's intent was clear. That code is never executed when T is a small
>> integer type.
>> I will still have a hard time believing this until I see a real example,
>> can you fill in the details ?
> Hal gave you a real example, have you tried? I feel like you're asking
> more effort from others than you are ready to put in: it took me less than
> 5 minutes to reproduce what Hal was describing using his snippet:
> See the difference between https://godbolt.org/g/YYtsxB and
> https://godbolt.org/g/dTBBDq
> --
> Mehdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170629/d64ceb77/attachment.html>

More information about the llvm-dev mailing list