[llvm-dev] The undef story
John Regehr via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 29 11:28:27 PDT 2017
On 6/29/17 9:41 AM, Peter Lawrence via llvm-dev wrote:
> This doesn’t make sense to me, a shift amount of 48 is “undefined” for
> unsigned char,
> How do we know this isn’t a source code bug,
> What makes us think the user intended the result to be “0”.
>
> This strikes me as odd, we are mis-interpreting the user’s code
> In such a way so as to improve performance, but that isn’t necessarily
> what the user intended.
The quoted text above is indicative of a serious misunderstanding and I
would like to stop it from leading anyone else astray.
The error is in thinking that we should consider the intent of a
developer when we decide which optimizations to perform. That isn't how
this works. LLVM code has a mathematical meaning: it describes
computations. Any transformation that we do is either mathematically
correct or it isn't.
A transformation is correct when it refines the meaning of a piece of
IR. Refinement mostly means "preserves equivalence" but not quite
because it also allows undefined behaviors to be removed. For example
"add nsw" is not equivalent to "add" but an "add nsw" can always be
turned into an "add". The opposite transformation is only permissible
when the add can be proven to not overflow.
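As a rough illustration of the refinement rule just described, here is a minimal sketch in Python that models 8-bit "add" and "add nsw" and checks both directions exhaustively. The names POISON, add_i8, add_nsw_i8 and refines are hypothetical helpers invented for this example, not LLVM APIs, and the model is deliberately simplified: it treats poison as a single sentinel and ignores undef and poison propagation.

```python
POISON = object()  # stand-in for LLVM's poison value (modelling assumption)

def wrap_i8(x):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (x + 128) % 256 - 128

def add_i8(a, b):
    """Model of plain `add` on i8: two's-complement wraparound, always defined."""
    return wrap_i8(a + b)

def add_nsw_i8(a, b):
    """Model of `add nsw` on i8: poison when the signed sum overflows."""
    s = a + b
    return s if -128 <= s <= 127 else POISON

def refines(src, tgt):
    """True if tgt agrees with src on every input where src is not poison.

    Where src yields poison, tgt may produce anything, so those inputs
    are skipped. This is a simplified reading of refinement.
    """
    for a in range(-128, 128):
        for b in range(-128, 128):
            s = src(a, b)
            if s is not POISON and tgt(a, b) != s:
                return False
    return True

# Dropping nsw is allowed; adding it without an overflow proof is not.
print(refines(add_nsw_i8, add_i8))
print(refines(add_i8, add_nsw_i8))
```

In this model the first check succeeds because "add" matches "add nsw" on every non-overflowing input, and the second fails because an input such as 127 + 1 is defined for "add" but poison for "add nsw".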
This is like the laws of physics for compiler optimizations: it is not
open to debate.
The place to consider developer intent, if one wanted to do that, is in
the frontend that generates IR. If we don't want undef or poison to ever
happen, then we must make the frontend generate IR that includes
appropriate checks in front of operations that are sometimes undefined.
To do this we have sanitizers and safe programming languages.
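To make the frontend's role concrete, here is a small sketch in Python of three ways a frontend might lower an 8-bit left shift. It assumes a model in which an oversized shift amount yields poison, and all names (POISON, shl_i8, checked_shl_i8, total_shl_i8) are hypothetical and chosen only for this illustration. The point is that the choice among them is made when IR is generated, not by the optimizer.

```python
POISON = object()  # stand-in for a poison result (modelling assumption)
BITS = 8           # width of the modelled integer type

def shl_i8(x, amt):
    """Raw shift as emitted with no check: poison if amt >= bit width.

    amt is assumed to be a non-negative integer.
    """
    if amt >= BITS:
        return POISON
    return (x << amt) & 0xFF

def checked_shl_i8(x, amt):
    """Sanitizer-style lowering: trap before the undefined case is reached."""
    if amt >= BITS:
        raise ValueError("shift amount %d out of range for i8" % amt)
    return shl_i8(x, amt)

def total_shl_i8(x, amt):
    """Safe-language-style lowering: the language defines the result as 0."""
    return 0 if amt >= BITS else shl_i8(x, amt)

print(shl_i8(1, 48) is POISON)
print(total_shl_i8(1, 48))
```

Only the third lowering expresses the intent "an oversized shift produces 0"; with the first, the IR simply does not say that, so the middle end has nothing of the kind to preserve.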
SUMMARY: The intent, whatever it is, must be translated into IR. The
LLVM middle end and backends are then obligated to preserve that
meaning. They generally do this extremely well. But they are not, and
must not be, obligated to infer the mental state of the developer who
wrote the code that is being translated.
John