[LLVMdev] Non "folding" Stack Allocation
Matthieu Monrocq
matthieu.monrocq at gmail.com
Wed Aug 17 05:02:34 PDT 2011
Following a question on StackOverflow [1], I was wondering whether, for big
allocations, LLVM would "delay" the allocation or perform it up front.
The following code was thus submitted to the LLVM Try Out page:
void doSomething(char*,char*);
void function(bool b)
{
    char b1[1 * 1024];
    if( b ) {
        char b2[1 * 1024];
        doSomething(b1, b2);
    } else {
        char b3[512 * 1024];
        doSomething(b1, b3);
    }
}
Certainly nothing spectacular.
I was however quite surprised by the output:
; ModuleID = '/tmp/webcompile/_28066_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"
define void @_Z8functionb(i1 zeroext %b) {
entry:
  %b1 = alloca [1024 x i8], align 1 ; <[1024 x i8]*> [#uses=1]
  %b2 = alloca [1024 x i8], align 1 ; <[1024 x i8]*> [#uses=1]
  %b3 = alloca [524288 x i8], align 1 ; <[524288 x i8]*> [#uses=1]
  %arraydecay = getelementptr inbounds [1024 x i8]* %b1, i64 0, i64 0 ; <i8*> [#uses=2]
  br i1 %b, label %if.then, label %if.else
if.then: ; preds = %entry
  %arraydecay2 = getelementptr inbounds [1024 x i8]* %b2, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay2)
  ret void
if.else: ; preds = %entry
  %arraydecay6 = getelementptr inbounds [524288 x i8]* %b3, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay6)
  ret void
}
declare void @_Z11doSomethingPcS_(i8*, i8*)
(Compiled with "Standard" optimizations as C++ code)
My surprise stems from the fact that Clang/LLVM seems to reserve (at least
in its IR) space for all local variables, without taking into account
that some are mutually exclusive. I would have expected the space to be
*folded*. However, since this is LLVM IR and not the final assembly, and
since LLVM IR is strongly typed, it makes sense to keep them separate.
Therefore I was wondering whether in the x86 output (say) these *would* be
folded, and if so, what is the name of the optimization/CodeGen pass
responsible?
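To make the expectation concrete, here is a rough sketch of the arithmetic only. It ignores alignment, padding, and anything else a backend puts in the frame, so the numbers are not what any particular compiler emits. The names kB1, kB2, kB3, unfoldedBytes, and foldedBytes are made up for illustration and are not LLVM APIs.

```cpp
#include <cstddef>

// Sizes of the three buffers declared in function(bool) above.
constexpr std::size_t kB1 = 1 * 1024;    // b1, live on both branches
constexpr std::size_t kB2 = 1 * 1024;    // b2, live only when b is true
constexpr std::size_t kB3 = 512 * 1024;  // b3, live only when b is false

// One slot per buffer, as the three separate allocas in the IR suggest.
constexpr std::size_t unfoldedBytes() { return kB1 + kB2 + kB3; }

// b2 and b3 are never live at the same time, so in principle they could
// share a single slot as large as the bigger of the two.
constexpr std::size_t foldedBytes() {
    return kB1 + (kB2 > kB3 ? kB2 : kB3);
}
```

Under these assumptions the unfolded layout needs 526336 bytes and the folded one 525312 bytes. The saving here is only the 1024 bytes of b2, but the same reasoning would matter more if both branches declared large buffers.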
-- Matthieu
[1] http://stackoverflow.com/questions/7089035/at-what-moment-is-memory-typically-allocated-for-local-variables-in-c