[LLVMdev] loads from a null address and optimizations

Török Edwin edwintorok at gmail.com
Mon Sep 7 08:36:04 PDT 2009


On 2009-09-07 18:29, Chris Lattner wrote:
>
> On Sep 6, 2009, at 2:01 PM, Török Edwin wrote:
>
>> On 2009-09-06 20:52, Bill Wendling wrote:
>>> The problem he's facing here isn't necessarily one of correctness.  
>>> He's dealing with undefined behavior (at least in C code). There are  
>>> no guarantees that the compiler will retain a certain semantic  
>>> interpretation of an undefined construct between different versions of  
>>> the compiler, let alone different optimization levels.
>>>
>>
>> Should LLVM IR inherit all that is undefined behavior in C?
>
> Yes, where it is useful for optimization purposes.
>
>> That makes it harder to support other languages, or new languages that
>> want different semantics
>> for things that the C standard defines as undefined.
>
> This is another question though.  I think that LLVM should support
> taking advantage of undefined behavior in C, but it should also allow
> other languages to model what they need.
>
> As a concrete example, there is no reason not to add a "bit" to
> LoadInst saying whether an "invalid" load is undefined or whether it
> causes an "exception".  The fun part is nailing down which cases of
> "invalid" are allowed, but it isn't that big of a deal.
>
>>>
>>> This is something that LLVM isn't currently good at, but that we're  
>>> actively interested in improving.  Here is some related stuff:
>>> http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
>>>
>>
>> Looks interesting, but it also looks like a lot of work to implement.
>
> Well that is why it hasn't been done yet :)
>
>> Could instructions have a flag that says whether their semantics is
>> C-like (i.e. undefined behavior when you load from null etc.), or
>> something else? (throw exception, etc.).
>
> Yes.  You need to tell the optimizer what the possible control flow is
> though, or else it will move operations in invalid ways.
>

Another crazy idea: what if we modeled the invalid/undefined behavior
via an llvm.undefinedbehavior intrinsic that has a parameter specifying
the kind of undefined behavior?
Optimizers should then either insert calls to this intrinsic, or do
whatever they do for C currently if TargetData says
llvm.undefinedbehavior should not be preserved.
Languages that need to handle these undefined behaviors could define
llvm.undefinedbehavior to throw an exception, call a runtime function, etc.
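
To make the idea concrete, here is a minimal C++ sketch of what the
lowered code might look like for a language that wants an exception on a
null load. Everything in it is an assumption made for illustration:
undefined_behavior() is a hypothetical stand-in for the proposed
llvm.undefinedbehavior intrinsic (which does not exist in LLVM), and
UBKind, checked_load() and try_load() are made-up names, not LLVM API.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical parameter of the proposed intrinsic: which kind of
// undefined behavior was reached. Only one kind is sketched here.
enum class UBKind { NullLoad };

// Hypothetical stand-in for llvm.undefinedbehavior. A language runtime
// that wants exceptions defines it to throw; another runtime could call
// a handler or abort instead.
[[noreturn]] void undefined_behavior(UBKind kind) {
    switch (kind) {
    case UBKind::NullLoad:
        throw std::runtime_error("load from null address");
    }
    throw std::logic_error("unknown undefined-behavior kind");
}

// What might be emitted for a load when the intrinsic has to be
// preserved: the invalid case is an explicit call, so the possible
// control flow is visible to the optimizer.
int checked_load(const int *p) {
    if (p == nullptr)
        undefined_behavior(UBKind::NullLoad);
    return *p;
}

// The caller's side, playing the role of invoke plus a landing pad.
std::string try_load(const int *p) {
    try {
        return "ok: " + std::to_string(checked_load(p));
    } catch (const std::runtime_error &e) {
        return std::string("trapped: ") + e.what();
    }
}
```

With C-like semantics the check and the call would simply be dropped,
leaving the plain load.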

This should work even if functions are marked nounwind, since the
unwinder will find the first stackframe that does have a landingpad and
land there, right [*]?
Frontends for languages that want exceptions for undefined behavior could
then use invoke/unwind too. When LLVM has a better invoke they'll
switch to that, of course.

[*] It seems to work for LLVM at least: operator new throws
std::bad_alloc and opt's catch() catches it, although all of LLVM is
compiled with no-exceptions.

Best regards,
--Edwin


