[LLVMdev] loads from a null address and optimizations

Sun Sep 6 15:12:27 PDT 2009

On Sep 6, 2009, at 4:01 PM, Török Edwin <edwintorok at gmail.com> wrote:

> On 2009-09-06 20:52, Bill Wendling wrote:
>> The problem he's facing here isn't necessarily one of correctness.
>> He's dealing with undefined behavior (at least in C code). There are
>> no guarantees that the compiler will retain a certain semantic
>> interpretation of an undefined construct between different versions
>> of the compiler, let alone different optimization levels.
>>
>
> Should LLVM IR inherit all that is undefined behavior in C?

For better or worse, it already inherits some of it. No, I don't
think the idea is to make LLVM dependent on C's way of doing things.
But one must assume some baseline for what to do with a particular
construct.

Apparently, at this time at least, it's considered good to turn a
dereference of null into unreachable. But as Chris mentioned, it's
something that we should improve.

> That makes it harder to support other languages, or new languages that
> want different semantics for things that the C standard defines as
> undefined.

Yup.

> BTW even for C gcc has -fno-delete-null-pointer-checks, and the Linux
> kernel started using that recently by default after all the exploits
> that mapped NULL to valid memory, and took advantage of gcc optimizing
> away the NULL checks.
>
What's the effect of this flag? I've never seen it before. :-) If
we're doing something that violates the semantics of this flag, then
it's something we need to fix, of course.

-bw
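
For context, a minimal sketch of the pattern that flag is aimed at. The
struct and function names are hypothetical and are not taken from the
kernel. Because dev is dereferenced before it is tested, a compiler that
treats a null dereference as undefined may conclude that dev is non-null
and delete the check; -fno-delete-null-pointer-checks asks gcc not to
make that inference.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct device {
    int flags;
};

/* Hypothetical function showing the "dereference, then check" pattern. */
int read_flags(struct device *dev)
{
    int flags = dev->flags;  /* dereference happens first */

    /* A compiler that assumes the load above cannot have used a null
     * pointer is allowed to treat this test as always false and
     * remove it. */
    if (dev == NULL)
        return -1;

    return flags;
}
```

With a valid pointer the function simply returns the field. The
interesting case is a null dev on a system where address zero is mapped:
the load succeeds, and if the check has been removed nothing stops the
code from carrying on with bogus data.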

> On 2009-09-06 23:22, Chris Lattner wrote:
>> On Sep 6, 2009, at 10:52 AM, Bill Wendling wrote:
>>
>>
>>> The problem he's facing here isn't necessarily one of correctness.
>>> He's dealing with undefined behavior (at least in C code). There are
>>> no guarantees that the compiler will retain a certain semantic
>>> interpretation of an undefined construct between different versions
>>> of the compiler, let alone different optimization levels.
>>>
>>> From what I understand, he wants a particular behavior from the OS
>>> (a signal). The compiler shouldn't have to worry about OS semantics in
>>> the face of undefined language constructs. That being said, if he
>>> wants to implement a couple of passes to change his code, then
>>> sure. :-)
>>>
>>
>> This is something that LLVM isn't currently good at, but that we're
>> actively interested in improving.  Here is some related stuff:
>> http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
>>
>
> Looks interesting, but it also looks like a lot of work to implement.
> Could instructions have a flag that says whether their semantics are
> C-like (i.e. undefined behavior when you load from null etc.), or
> something else (throw an exception, etc.)?
> Optimizations that assume the behavior is undefined should be updated
> to check that flag, and perform the optimization only if the flag is
> set to C-like.
>
> What do you think?
>
>> I don't know of anyone working on this, or planning to work on it in
>> the short term though.
>>
>
>
> Although this is something I'd be interested in having, I lack the
> time to implement it.
>
> Best regards,
> --Edwin
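
The per-instruction flag Edwin proposes can be sketched as a toy model.
Everything below is hypothetical: the types, the flag, and the function
are invented for illustration and are not LLVM's API or IR. The only
point is that the "load from null becomes unreachable" rewrite is gated
on the instruction declaring C-like semantics.

```c
#include <stdbool.h>

/* Toy instruction model; not LLVM's representation. */
enum opcode { OP_LOAD, OP_UNREACHABLE };

/* The proposed flag: what a load from null means for this instruction. */
enum null_semantics {
    NULL_UNDEFINED,  /* C-like: the optimizer may assume it never happens */
    NULL_TRAPS       /* the front end wants a signal or exception */
};

struct inst {
    enum opcode op;
    bool addr_is_null;            /* address operand is a constant null */
    enum null_semantics on_null;
};

/* Rewrites a load from null to unreachable only when the instruction is
 * flagged C-like. Returns true if the instruction was changed. */
bool fold_null_load(struct inst *i)
{
    if (i->op != OP_LOAD || !i->addr_is_null)
        return false;
    if (i->on_null != NULL_UNDEFINED)
        return false;  /* non-C semantics requested: keep the load */
    i->op = OP_UNREACHABLE;
    return true;
}
```

Under this sketch a C front end would emit loads tagged NULL_UNDEFINED
and keep today's behavior, while a front end for a language that must
raise an exception on a null load would tag them NULL_TRAPS and the
rewrite would leave them alone.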