[LLVMdev] Non "folding" Stack Allocation

Wed Aug 17 05:02:34 PDT 2011

Following a question on StackOverflow [1], I was wondering whether, for big
allocations, LLVM would "delay" the allocation or perform it upfront.

The following code was thus submitted to the LLVM Try Out page:

void doSomething(char*,char*);

void function(bool b)
{
    char b1[1 * 1024];
    if( b ) {
       char b2[1 * 1024];
       doSomething(b1, b2);
    } else {
       char b3[512 * 1024];
       doSomething(b1, b3);
    }
}

Certainly nothing spectacular.

I was however quite surprised by the output:

; ModuleID = '/tmp/webcompile/_28066_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"

define void @_Z8functionb(i1 zeroext %b) {
entry:
  %b1 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b2 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b3 = alloca [524288 x i8], align 1            ; <[524288 x i8]*> [#uses=1]
  %arraydecay = getelementptr inbounds [1024 x i8]* %b1, i64 0, i64 0 ; <i8*> [#uses=2]
  br i1 %b, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %arraydecay2 = getelementptr inbounds [1024 x i8]* %b2, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay2)
  ret void

if.else:                                          ; preds = %entry
  %arraydecay6 = getelementptr inbounds [524288 x i8]* %b3, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay6)
  ret void
}

declare void @_Z11doSomethingPcS_(i8*, i8*)

(Compiled with "Standard" optimizations as C++ code)

My surprise stems from the fact that Clang/LLVM seems to reserve (at least
in its IR) space for all local variables, without taking into account
that some of them are mutually exclusive. I would have expected the space to be
*folded*. However, since this is LLVM IR and not the final assembly, and
since LLVM IR is strongly typed, it makes sense to keep them separate.

Therefore I was wondering whether in the x86 representation (say) these
*would* be folded, and if so, what is the name of the optimization/CodeGen
pass responsible?
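
To make "folded" concrete, here is a rough sketch of the arithmetic I have in
mind. It only adds up the array sizes from the example function and ignores
alignment, padding, and whatever else a real backend puts in the frame. The
constant names are made up for illustration, and the "folded" layout is
hypothetical: it is what I would expect, not something the output above shows.

```cpp
#include <cstddef>

// Array sizes taken from the example function in this post.
constexpr std::size_t kB1Bytes = 1 * 1024;    // b1, live in both branches
constexpr std::size_t kB2Bytes = 1 * 1024;    // b2, only used in the "if" branch
constexpr std::size_t kB3Bytes = 512 * 1024;  // b3, only used in the "else" branch

// One slot per array, matching the three separate allocas in the IR.
constexpr std::size_t kSeparateBytes = kB1Bytes + kB2Bytes + kB3Bytes;

// Hypothetical folded layout: b2 and b3 are never live at the same time,
// so they could share a single slot sized for the larger of the two.
constexpr std::size_t kFoldedBytes =
    kB1Bytes + (kB2Bytes > kB3Bytes ? kB2Bytes : kB3Bytes);
```

With these sizes, kSeparateBytes is 526336 and kFoldedBytes is 525312, so
folding alone would save only 1024 bytes here. The "if" branch would still pay
for the 512 KiB slot unless the allocation were also delayed.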

-- Matthieu

[1] http://stackoverflow.com/questions/7089035/at-what-moment-is-memory-typically-allocated-for-local-variables-in-c