[LLVMdev] Call to address 0 gets removed

John McCall rjmccall at apple.com
Wed Jun 10 14:25:50 PDT 2009


On Jun 10, 2009, at 1:18 PM, Nick Lewycky wrote:

> 2009/6/10 John McCall <rjmccall at apple.com>
> > There's another point that hasn't been raised yet here, which is that
> > the undefinedness of calling (void*) 0 is a property of C, not
> > necessarily of the LLVM abstract language.  I think you can make an
> > excellent case that the standard optimizations should not be enforcing
> > C language semantics, or at least should allow such optimizations to
> > be disabled.
>
> All sorts of optimizations rely on this, whether as simple as
> eliminating comparisons of alloca against null to knowing that two
> malloc'd pointers can never alias (what if malloc returns null? if
> null is valid then you can store data there...).

I'm not saying we should never make *any* assumptions about null, or  
that C-specific assumptions should be totally unwelcome in standard  
passes.  I'm saying that current practice makes it very difficult to  
avoid certain C-specific assumptions.

Let's take your examples.  The assumption that alloca never produces  
null seems like a reasonable cross-language assumption to me, based on  
alloca's status as a compiler-defined (and totally unstandardized)  
intrinsic;  if I need more rigid semantics, I shouldn't be using  
alloca.  The assumption that the function called malloc never returns  
aliasing pointers is indeed a C-specific assumption, but it's one that  
I can easily avoid if necessary by, well, not using C-specific libcall  
optimizations.  And most of these C-inspired assumptions fall into one  
of those two categories:  it's either generally valid or easily  
disabled.

On the other hand, the assumption that calls to null are undefined  
behavior is so hard-coded into instcombine that I can only avoid it by  
refusing to run the entire instcombine pass, or by carefully guarding  
how I emit calls that might be to null.  And I do think this is  
inappropriate for a core pass, just as it would be if someone made
BasicAliasAnalysis do type-based alias analysis based on C's
strict-aliasing rules, or modified a loop-counting pass to use C's
signed-overflow semantics, and so on.  At the very least, there should
be some way to configure this on the pass.
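
To make the second workaround concrete: "carefully guarding how I emit
calls that might be to null" amounts to making the null check explicit,
so that no call through a null pointer is ever reached and the language's
own behavior for that case is spelled out. A minimal C sketch of what a
frontend would effectively have to generate; every name here
(handler_fn, guarded_call, CALL_TARGET_NULL, add_one) is hypothetical and
not part of LLVM:

```c
#include <stddef.h>

/* Hypothetical callee type, used only for illustration. */
typedef int (*handler_fn)(int);

/* Hypothetical status code reported when the call target is null. */
enum { CALL_TARGET_NULL = -1 };

/*
 * Calls fn(arg) only when fn is non-null. On success, stores the result
 * in *result and returns 0. When fn is null, no call is made and
 * CALL_TARGET_NULL is returned, so the null case has behavior chosen by
 * the source language rather than by an optimizer's assumptions.
 */
int guarded_call(handler_fn fn, int arg, int *result) {
    if (fn == NULL) {
        return CALL_TARGET_NULL;
    }
    *result = fn(arg);
    return 0;
}

/* Small callee used to exercise guarded_call. */
int add_one(int x) {
    return x + 1;
}
```

For example, guarded_call(add_one, 41, &out) returns 0 and leaves 42 in
out, while guarded_call(NULL, 0, &out) returns CALL_TARGET_NULL without
touching out. The cost is an extra compare and branch at every such call
site, which is exactly the burden being objected to above.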

> > Case in point: calls/loads/stores to null may be undefined behavior
> > in C, but they're certainly not undefined behavior in (say) Java.
> > There's a well-known implementation trick in JVMs where you
> > optimistically emit code assuming non-null objects, and then you
> > install signal handlers to raise exceptions in the cases where you're
> > wrong.  Now, obviously that trick is going to have implications for
> > the optimizers beyond "don't mark null stores as unreachable", but
> > even so, it really shouldn't be totally precluded by widespread
> > assumptions of C semantics.
>
> The current workaround is to use an alternate address space for your
> pointers. At some point we may extend the load/store/call
> instructions to specify their exact semantics similarly to the
> integer overflow proposal
> ( http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt ).

I'll note that instcombine actually marks stores to null as  
unreachable regardless of the address space of the pointer, unless I'm  
missing something subtle.
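
For readers unfamiliar with the JVM trick quoted above (emit the access
without a null check, and recover in a signal handler when the pointer
turns out to be null), here is a rough C sketch. It assumes a POSIX
system on which a load from address 0 raises SIGSEGV; it is not
thread-safe; and the volatile qualifier is there to keep the compiler
from treating the null load as undefined and dropping it. All names
(install_null_trap, optimistic_load, on_segv) are hypothetical, and this
is not how any particular JVM implements it:

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Where the handler jumps back to when a guarded load faults. */
static sigjmp_buf recover_point;
/* Nonzero only while a guarded load is in progress. */
static volatile sig_atomic_t load_in_progress = 0;

static void on_segv(int signo) {
    (void)signo;
    if (load_in_progress) {
        /* Abandon the faulting load and resume in optimistic_load. */
        siglongjmp(recover_point, 1);
    }
    /* Not a guarded load: restore the default action so that the
     * re-executed faulting instruction terminates the process. */
    signal(SIGSEGV, SIG_DFL);
}

/* Installs the SIGSEGV handler. Returns 0 on success, -1 on failure. */
int install_null_trap(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/*
 * Loads *p with no explicit null check. Returns 0 and stores the value
 * in *out on success; returns -1 if the load faulted, which stands in
 * for "raise a NullPointerException".
 */
int optimistic_load(const volatile int *p, int *out) {
    if (sigsetjmp(recover_point, 1) != 0) {
        /* We arrive here from on_segv via siglongjmp. */
        load_in_progress = 0;
        return -1;
    }
    load_in_progress = 1;
    int value = *p;   /* may fault when p is null */
    load_in_progress = 0;
    *out = value;
    return 0;
}
```

After install_null_trap() succeeds, optimistic_load(&x, &out) takes the
fast path with no compare or branch on the pointer, and
optimistic_load(NULL, &out) is expected to return -1 on platforms that
meet the assumptions above. None of this works if an optimizer has
already rewritten the access to null as unreachable.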

John.