<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Mehdi,<div class=""> I think the following was the point of the conversation:</div><div class="">that both those examples are illegal C programs.</div><div class="">They are both “undefined behavior” because they both</div><div class="">use a shift amount that is too large.</div><div class="">They both should have been rejected by the compiler</div><div class="">even though they weren’t.</div><div class="">Hal agrees with this assessment;</div><div class="">that’s why we’re waiting for a more complete example.</div><div class=""><br class=""></div><div class="">My belief is that undefined behavior is an optimization hazard,</div><div class="">not an optimization opportunity. It’s just a belief, and I could be proved</div><div class="">wrong at any moment, but it feels right to me.</div><div class=""><br class=""></div><div class="">I would start looking for a more complete example myself, but my</div><div class="">belief is so strong that "optimizing undefined behavior" seems </div><div class="">like a self-contradiction to me, and I don’t know where to</div><div class="">even start looking.</div><div class=""><br class=""></div><div class="">I write compiler test programs in my spare time as a hobby</div><div class="">(which someday I’d like to contribute to LLVM),</div><div class="">so it’s not like I don’t have the knowledge or the inclination;</div><div class="">I just don’t know how to approach this problem.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">You would think that, since “optimization of undefined behavior”</div><div class="">has become such a bedrock concept in LLVM, by now some</div><div class="">concrete examples would be readily at hand,</div><div class="">but this doesn’t seem to be the case.</div><div class=""><br 
class=""></div><div class="">So I’m eagerly awaiting Hal’s (or anyone else's) next email</div><div class="">that has a complete example.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">Peter Lawrence.</div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class="gmail-h5"><div class=""><blockquote type="cite" class=""><div class=""><blockquote type="cite" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" class=""><div class=""><blockquote type="cite" class=""><div class=""><div bgcolor="#FFFFFF" class=""><br class="">I can't comment on SPEC, but this does remind me of code I was working on recently. 
To abstract the relevant parts, it looked something like this:<br class=""><br class="">template &lt;typename T&gt;<br class="">int do_something(T mask, bool cond) {<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>if (mask &amp; 2)<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>return 1;<br class=""><br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>if (cond) {<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>T high_mask = mask &gt;&gt; 48;<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>if (high_mask &gt; 5)<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>do_something_1(high_mask);<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>else if (high_mask &gt; 3)<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>do_something_2();<br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>}<br class=""><br class=""> <span class="gmail-m_-3269729547621288564Apple-converted-space"> </span>return 0;<br class="">}<br class=""><br class="">This function ended up being instantiated on different types T (e.g. unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond was always false when T was char. The question is: Can the compiler eliminate all of the code predicated on cond for the smaller types? In this case, this code was hot, and moreover, performance depended on the fact that, for T = unsigned char, the function was inlined and the branch on cond was eliminated. In the relevant translation unit, however, the compiler would never see how cond was set.<br class=""><br class="">Luckily, we do the right thing here currently. In the case where T = unsigned char, we end up folding both of the high_mask tests as though they were false. 
That entire part of the code is eliminated, the function is inlined, and everyone is happy.<br class=""><br class="">Why was I looking at this? As it turns out, if the 'else if' in this example is just 'else', we don't actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. we don't recognize all of the code as dead).</div></div></blockquote><div class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class="">I apologize in advance if I have missed something here and am misreading your example...</div><div class=""><br class=""></div></div><div class="">This doesn’t make sense to me: a shift amount of 48 is “undefined” for unsigned char.</div><div class="">How do we know this isn’t a source code bug?</div><div class="">What makes us think the user intended the result to be “0”?</div></div></blockquote><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" class=""><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);float:none;display:inline" class="">As I said, this is a representation of what the real code did, and looked like, after other inlining had taken place, etc. In the original form, the user's intent was clear. 
That code is never executed when T is a small integer type.</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" class=""></div></blockquote><br class=""></div><div class=""><br class=""></div></div></div><div class="">I will still have a hard time believing this until I see a real example. Can you fill in the details?</div></div></blockquote><div class=""><br class=""></div><div class=""><br class=""></div><div class="">Hal gave you a real example; have you tried it? I feel like you're asking for more effort from others than you are ready to put in: it took me less than 5 minutes to reproduce what Hal was describing using his snippet:</div><div class=""><br class=""></div><div class="">See the difference between <a href="https://godbolt.org/g/YYtsxB" class="">https://godbolt.org/g/YYtsxB</a> and <a href="https://godbolt.org/g/dTBBDq" class="">https://godbolt.org/g/dTBBDq</a></div><div class=""><br class=""></div><div class="">-- </div><div class="">Mehdi</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""></div></div></div></div>
</div></blockquote></div><br class=""></div></body></html>