[LLVMdev] Promoting malloc to alloca

Mon Jul 12 23:00:00 PDT 2010

Matt Giuca wrote:
> OK thanks for the replies.
>
> Yes, I was planning to use a garbage collector. This is for a
> functional language, so there's no real way to determine when memory
> needs to be freed without one.
>
>> Firstly, the pointer has to not make it into any function call at all, since any function might in turn call free(). Then we need to do escape analysis as you pointed out, but that's not difficult. We do similar analysis to determine pointer capture already.
>
> OK. Is there any way to access the results of this pointer capture
> analysis (eg. have LLVM tell me which pointers will escape and which
> won't). That way I could manually change the mallocs into allocas.

That depends what you mean by "escape". The LLVM concept of 'capture' 
means, "did the pointer outlive the function call?" We use this in an 
interprocedural context, for example, you can call strlen() with a 
pointer and rest assured that no copies of the pointer were made.

While this is useless for malloc elimination in the general case (you 
care whether the callee might call free() not just whether it makes a 
copy) in your case with no explicit frees I think it's probably 
equivalent. The API is in include/llvm/Analysis/CaptureTracking.h, and 
link to libAnalysis.

> Now I can see why LLVM doesn't do this -- it would be unsafe in
> general. In my language (and many other high-level garbage-collected
> languages which don't generate free() calls), it would be useful, but
> LLVM itself can't guarantee that it's safe.
>
>>>
>>> define i32 @dontescape(i32 %x) {
>>> entry:
>>>    %t = tail call i8* @malloc(i32 4)
>>>    %t1 = bitcast i8* %t to i32*
>>>    store i32 %x, i32* %t1
>>>    %t2 = load i32* %t1
>>>    ret i32 %t2
>>> }
>>
>> When do you free it? Why not just use an alloca?
>
> Well I can't use an alloca because the code is generated by a compiler
> which just knows that it has to allocate memory, and expects it to be
> freed by a garbage collector. I would like to generate an alloca
> instead of a malloc in some situations, such as this, so the purpose
> of this email was to work out whether I can get LLVM to promote it for
> me, or whether I have to do the analysis myself.
>
>> LLVM doesn't realize that @malloc has no other side-effects. It's treating it like a function that might perform I/O.
>
> Right, which is why I was trying to change its annotations to show
> LLVM that @malloc doesn't have I/O side-effects, so it can be removed
> if its result is unused.
>
>>> So firstly, is there anything I'm doing wrong which is preventing this
>>> from working? For example, I tried declaring malloc as "readnone" (is
>>> this safe?).
>>
>> No. If it's readnone then two calls with the same argument can be folded to one.
>
> Good point. Is there any annotation which says LLVM can remove the
> call if the result is unused (i.e., there are no side-effects), but it
> can't combine two calls into one (i.e., the result might be different
> each time). That seems like what "readonly" is for, but I thought I'd
> check.
>
> In other words, is it safe to annotate malloc as "readonly"?

Again, no. Sorry! LLVM is smart enough to know that if it sees two 
readonly calls with the same arguments -- and there's nothing in between 
that could write to memory -- then they can once again be folded as the 
calls must return the same thing.

The attribute you're looking for, "delete if result is unused" doesn't 
exist in LLVM. I've considered it in the past, but the truth is that 
most of the time I want to eliminate dead malloc's, they *do* have a 
use: the matching free. At some point I expect I'm going to teach LLVM 
to remove dead malloc+free / new+delete / new[]+delete[] pairings, but I 
haven't gotten around to it yet. Teaching it to kill an allocation with 
no uses at all will be dead simple too.

>> Doing this in general requires either that the free can't be found or that it really is right before the function exit, or else there might be a good reason the memory is on the heap. Then we have to make sure that we don't transform malloc calls inside loops (unless we also prove they're independent and Just Leaked).
>>
>> How large a block of memory are you willing to move from heap to stack? What if it's a variable, how hard should we work to try to determine its upper bound (we probably can't with any reasonable amount of effort). And what type is the alloca? We could take the 4 from malloc(i32 4) and create an alloca i8, 4 and let instcombine try to sort it out.
>>
>> Where should we look for malloc calls? Note that we only do stack to register transformation on alloca's in the first basic block if they're stacked together in a row as the very first instructions in the function.
>
> Thanks, a useful summary of the problems.
>
> It sounds like this analysis is best implemented in a higher-level
> compiler, before generating LLVM code.

You could try implementing it as an LLVM pass which is specific to your 
language. That way you would benefit from any effect where LLVM has 
managed to optimize away the uses of a malloc to let you remove the 
malloc later on.

Nick