<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 29, 2017 at 11:28 AM, John Regehr via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 6/29/17 9:41 AM, Peter Lawrence via llvm-dev wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

This doesn’t make sense to me: a shift amount of 48 is “undefined” for<br>

unsigned char.<br>

How do we know this isn’t a source code bug?<br>

What makes us think the user intended the result to be “0”?<br>

<br>

This strikes me as odd: we are misinterpreting the user’s code<br>

in such a way as to improve performance, but that isn’t necessarily<br>

what the user intended.<br>

</blockquote>

<br></span>

The quoted text above is indicative of a serious misunderstanding and I would like to stop it from leading anyone else astray.<br>

<br>

The error is in thinking that we should consider the intent of a developer when we decide which optimizations to perform. That isn't how this works. LLVM code has a mathematical meaning: it describes computations. Any transformation that we do is either mathematically correct or it isn't.<br>

<br>

A transformation is correct when it refines the meaning of a piece of IR. Refinement mostly means "preserves equivalence", but not quite, because it also allows undefined behaviors to be removed: the transformed code may be defined in cases where the original was not. For example, "add nsw" is not equivalent to "add", but an "add nsw" can always be turned into an "add". The opposite transformation is only permissible when the add can be proven not to overflow.<br>
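<br>
The refinement rule can be checked exhaustively on a toy model. The sketch below is an illustration only, not LLVM code: it assumes 8-bit signed integers, represents poison as Python's None, and uses hypothetical helper names (add, add_nsw, refines).<br>

```python
from itertools import product

BITS = 8          # assumed width for this toy model
POISON = None     # stand-in for LLVM's poison value (an assumption of the model)

LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def to_signed(x):
    """Wrap an integer into the signed BITS-wide range."""
    x &= (1 << BITS) - 1
    return x - (1 << BITS) if x > HI else x


def add(a, b):
    """Plain wrapping add: defined for every pair of inputs."""
    return to_signed(a + b)


def add_nsw(a, b):
    """Model of 'add nsw': poison when the signed sum overflows."""
    s = a + b
    return POISON if s < LO or s > HI else s


def refines(source, target):
    """True if target agrees with source wherever source is defined."""
    values = range(LO, HI + 1)
    return all(
        source(a, b) is POISON or source(a, b) == target(a, b)
        for a, b in product(values, values)
    )


print(refines(add_nsw, add))   # replacing 'add nsw' with 'add'
print(refines(add, add_nsw))   # replacing 'add' with 'add nsw'
```

In this model, replacing add_nsw with add passes the check, while the reverse direction fails on inputs such as 127 + 1, where add is defined but add_nsw is poison.<br>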

<br>

This is like the laws of physics for compiler optimizations; it is not open to debate.<br>

<br>

The place to consider developer intent, if one wanted to do that, is in the frontend that generates IR. If we don't want undef or poison to ever happen, then we must make the frontend generate IR that includes appropriate checks in front of operations that are sometimes undefined. To do this we have sanitizers and safe programming languages.<br>
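<br>
As a rough sketch of what such a frontend-inserted check means for the shift discussed above, the hypothetical helper below models a left shift on an 8-bit unsigned value that traps instead of producing an undefined result. The name checked_shl and the choice to raise ValueError are assumptions made for illustration; this is not how any particular sanitizer is implemented.<br>

```python
def checked_shl(value, amount, bits=8):
    """Left shift within a bits-wide unsigned type, trapping on bad amounts.

    Models a frontend that emits an explicit range check in front of the
    shift, so the shift itself is never executed with an invalid amount.
    """
    if not 0 <= amount < bits:
        raise ValueError(f"shift amount {amount} is out of range for i{bits}")
    return (value << amount) & ((1 << bits) - 1)


print(checked_shl(1, 3))       # in-range shift amount
try:
    checked_shl(1, 48)         # the shift amount from the quoted question
except ValueError as err:
    print("trapped:", err)
```

With the check in place, the meaning of the out-of-range case is fixed before the optimizer ever sees it, so there is nothing left for it to guess.<br>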

<br>

SUMMARY: The intent, whatever it is, must be translated into IR. The LLVM middle end and backends are then obligated to preserve that meaning. They generally do this extremely well. But they are not, and must not be, obligated to infer the mental state of the developer who wrote the code that is being translated.<br></blockquote><div><br></div><div>Thanks so much for writing this, John.</div><div><br></div><div>This is something that I always have to explain to my interns or other folks that I'm bringing up to speed on compiler development (or, on a bad day, to angry users :P). For some reason, it doesn't seem to be widely known or written down in very many (any?) places suitable for people new to the topic. I especially like how you've phrased this as "the laws of physics for compiler optimizations"; I think I'll be stealing that, as it's a bit more memorable than "fundamental rule of compiler optimizations".</div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

John<div class="HOEnZb"><div class="h5"><br>


</div></div></blockquote></div><br></div></div>