[LLVMdev] optimization assumes malloc return is non-null

Thu May 1 13:01:51 PDT 2008

On May 1, 2008, at 3:30 PM, Sandro Magi wrote:
> Sorry, for j
> On Thu, May 1, 2008 at 1:16 PM, David Vandevoorde
> <daveed at vandevoorde.com> wrote:
>>
>> Another valid implementation of malloc is one that actually returns a
>> non-null pointer in this case, and for such an implementation, a valid
>> reduction is "int main() { return 1; }".  That reduction is IMO not
>> only valid, but also defensible and maybe desirable.  (And LLVM
>> apparently does so.)
>
> But this reduction depends on knowledge of the malloc implementation,
> which no one has confirmed is actually the case here.

Not really, because the implementation doesn't get called.  Another  
way of thinking about it is as follows: Let __general_malloc be the  
"normal implementation of malloc".  Then the compiler replaces that  
implementation by

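	/* Pseudocode: what the compiler conceptually substitutes for malloc. */
	inline void *malloc(size_t n) {
	  if (this_one_special_call_site()) {
	    /* storage set aside for this one call site only */
	    static byte buf[n] __attribute((most_aligned));
	    return (void*)buf;
	  } else {
	    return __general_malloc(n);
	  }
	}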
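
A minimal C sketch of the same idea, for readers who want something that compiles. It assumes C11 and assumes the special call site requests a compile-time constant size (sizeof(int) here). The names general_malloc, special_buf, malloc_at_special_site, special_site_succeeds and ordinary_site_succeeds are hypothetical and exist only for this illustration; they are not part of LLVM or any C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "normal" allocator (__general_malloc above). */
static void *general_malloc(size_t n) {
    return malloc(n);
}

/* Storage reserved for the one special call site. The size is fixed because
   the request at that site is assumed to be a compile-time constant. */
static _Alignas(max_align_t) unsigned char special_buf[sizeof(int)];

/* Hypothetical specialization of malloc for that single call site. The
   general allocator is never entered, so nothing about it has to be known. */
static void *malloc_at_special_site(size_t n) {
    assert(n <= sizeof special_buf);
    return special_buf;
}

/* The special call site: the pointer is used only in the null test. */
int special_site_succeeds(void) {
    void *p = malloc_at_special_site(sizeof(int));
    return p != NULL;
}

/* Every other call site still goes through the general allocator and
   must be prepared for a null result. */
int ordinary_site_succeeds(size_t n) {
    void *p = general_malloc(n);
    int ok = (p != NULL);
    free(p);
    return ok;
}
```

In this sketch special_site_succeeds() cannot observe a null pointer, which is why "return 1" is a faithful reduction of it, while ordinary_site_succeeds() keeps the usual failure path.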

> On Wed, Apr 30, 2008 at 10:21 PM, Chris Lattner <sabre at nondot.org>  
> wrote:
>> LLVM should not (and does not, afaik) assume the malloc succeeds in
>> general.
>>
>> If LLVM is able to eliminate all users of the malloc assuming the
>> malloc succeeded (as in this case), then it is safe to assume the
>> malloc returned success.
>
> I don't see how this could be true in general, without either
> knowledge of the malloc implementation, which would be fine, or
> presuming knowledge of the target, which would not be fine. If
> "malloc(sizeof(int))" were changed to "malloc(3245677423)", would it
> still be eliminated?

That's a good question.  More specifically, does the _language_ allow
the optimization if I ask for so much storage that the address space
could not possibly accommodate an object of the given size "disjoint
from any other object" (words of 7.20.3/1)?  My view is that yes, this
is allowed, because the words in 7.20.3/1 impose that requirement only
on an _object_ associated with that storage, and no such object exists
here because the address is never used to create a non-opaque pointer.
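
To make the two cases concrete, here is a small C sketch. The function names are hypothetical, and the program shape (allocate, test for null, free, never dereference) is an assumption about the kind of code under discussion. In both functions the pointer is only compared against null, so no object is ever accessed through it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Small request: the pointer is only tested, never dereferenced. */
int tiny_request_succeeds(void) {
    int *p = malloc(sizeof(int));
    int ok = (p != NULL);
    free(p);
    return ok;
}

/* The size from the question above. The pointer is again only tested,
   so no object is ever created in the requested storage. */
int huge_request_succeeds(void) {
    void *p = malloc((size_t)3245677423UL);
    int ok = (p != NULL);
    free(p);
    return ok;
}
```

Built without optimization, each function reports whatever the real allocator does, so the second may return 0 or 1 depending on the platform. Under the reading argued above, a compiler would be permitted to reduce either function to "return 1"; the sketch does not claim that any particular compiler does so.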

Here is 7.20.3 for reference:

"The order and contiguity of storage allocated by successive calls to  
the calloc, malloc, and realloc functions is unspecified. The pointer
returned if the allocation succeeds is suitably aligned so that it may  
be assigned to a pointer to any type of object and then used to access  
such an object or an array of such objects in the space allocated  
(until the space is explicitly deallocated). The lifetime of an  
allocated object extends from the allocation until the deallocation.  
Each such allocation shall yield a pointer to an object disjoint from  
any other object. The pointer returned points to the start (lowest byte
address) of the allocated space. If the space cannot be allocated, a  
null pointer is returned. If the size of the space requested is zero,  
the behavior is implementation-defined: either a null pointer is
returned, or the behavior is as if the size were some nonzero value,  
except that the returned pointer shall not be used to access an object."

Admittedly, that argument is weaker than for small bounded allocations  
where hypothetical addresses can just be stolen from non-data pages.

	Daveed