[llvm-dev] The undef story

Thu Jun 29 16:23:47 PDT 2017

On Thu, Jun 29, 2017 at 11:28 AM, John Regehr via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On 6/29/17 9:41 AM, Peter Lawrence via llvm-dev wrote:
>
>> This doesn’t make sense to me: a shift amount of 48 is “undefined” for
>> unsigned char.
>> How do we know this isn’t a source code bug?
>> What makes us think the user intended the result to be “0”?
>>
>> This strikes me as odd: we are misinterpreting the user’s code
>> in such a way as to improve performance, but that isn’t necessarily
>> what the user intended.
>>
>
> The quoted text above is indicative of a serious misunderstanding and I
> would like to stop it from leading anyone else astray.
>
> The error is in thinking that we should consider the intent of a developer
> when we decide which optimizations to perform. That isn't how this works.
> LLVM code has a mathematical meaning: it describes computations. Any
> transformation that we do is either mathematically correct or it isn't.
>
> A transformation is correct when it refines the meaning of a piece of IR.
> Refinement mostly means "preserves equivalence" but not quite because it
> also allows undefined behaviors to be removed. For example "add nsw" is not
> equivalent to "add" but an "add nsw" can always be turned into an "add".
> The opposite transformation is only permissible when the add can be proven
> to not overflow.
>
> This is like the laws of physics for compiler optimizations; it is not
> open to debate.
>
> The place to consider developer intent, if one wanted to do that, is in
> the frontend that generates IR. If we don't want undef or poison to ever
> happen, then we must make the frontend generate IR that includes
> appropriate checks in front of operations that are sometimes undefined. To
> do this we have sanitizers and safe programming languages.
>
> SUMMARY: The intent, whatever it is, must be translated into IR. The LLVM
> middle end and backends are then obligated to preserve that meaning. They
> generally do this extremely well. But they are not, and must not be,
> obligated to infer the mental state of the developer who wrote the code
> that is being translated.
>

Thanks so much for writing this, John.

This is something that I always have to explain to my interns or other
folks that I'm bringing up to speed on compiler development (or on a bad
day, to angry users :P). For some reason, it doesn't seem to be widely
known or written down in very many (any?) places suitable for people new to
the topic. I especially like how you've phrased this as "the laws of
physics for compiler optimizations"; I think I'll be stealing that as it's
a bit more memorable than "fundamental rule of compiler optimizations".
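
For anyone new to the topic, here is a minimal sketch of the refinement rule
John describes. It is a toy Python model, not anything LLVM ships: the names
POISON, add, add_nsw, and refines are hypothetical, the integers are assumed
to be 8-bit signed, and poison is modeled as a single sentinel value, which
glosses over how poison propagates through real IR.

```python
from itertools import product

BITS = 8
INT_MIN = -(1 << (BITS - 1))      # -128
INT_MAX = (1 << (BITS - 1)) - 1   # 127
POISON = "poison"                 # hypothetical stand-in for LLVM's poison value


def to_signed(x):
    """Wrap an arbitrary Python int to an 8-bit two's-complement value."""
    x &= (1 << BITS) - 1
    return x - (1 << BITS) if x > INT_MAX else x


def add(a, b):
    """Model of plain `add`: wraps on overflow and is always defined."""
    return to_signed(a + b)


def add_nsw(a, b):
    """Model of `add nsw`: the result is poison on signed overflow."""
    s = a + b
    return POISON if s < INT_MIN or s > INT_MAX else s


def all_inputs():
    """Every pair of 8-bit signed operands."""
    return product(range(INT_MIN, INT_MAX + 1), repeat=2)


def refines(target, source, inputs):
    """True if replacing `source` with `target` is a refinement.

    Wherever `source` is poison, `target` may produce anything.
    Wherever `source` is defined, `target` must produce the same value.
    """
    return all(
        source(a, b) == POISON or target(a, b) == source(a, b)
        for a, b in inputs
    )


if __name__ == "__main__":
    # Dropping nsw only removes undefined outcomes, so it is allowed.
    print("add nsw -> add is a refinement:", refines(add, add_nsw, all_inputs()))
    # Adding nsw introduces poison where there was a defined result.
    print("add -> add nsw is a refinement:", refines(add_nsw, add, all_inputs()))
```

In this model, dropping nsw is a refinement for all 65,536 input pairs, while
adding nsw is not: it introduces poison on inputs such as 127 + 1, where the
plain add had a defined, wrapped result. That matches the rule that the
opposite transformation needs a proof that the add cannot overflow.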

-- Sean Silva

>
> John
>