[LLVMdev] algebraic (de)optimizations form long chains of dependent operations
Bjorn De Sutter
bjorn.desutter at elis.ugent.be
Mon Nov 14 08:30:44 PST 2011
Hi,
when I compile (clang -O3) and optimize (opt -O3) C-code like this:
sum1 = val1 + val2;
sum2 = val3 + val4;
sum3 = val5 + val6;
sum4 = val7 + val8;
sum5 = sum1 + sum2;
sum6 = sum3 + sum4;
sum7 = sum5 + sum6;
sum += sum7;
I get bitcode like this:
if.end152: ; preds = %if.then150, %if.else146, %if.end137
%val8.0 = phi i32 [ -2048, %if.then150 ], [ %conv38, %if.else146 ], [ 2047, %if.end137 ]
%conv154 = trunc i32 %val8.0 to i16
store i16 %conv154, i16* %add.ptr36, align 2
%add = add i32 %val1.0, %sum.1
%add161 = add i32 %add, %val2.0
%add170 = add i32 %add161, %val3.0
%add167 = add i32 %add170, %val4.0
%add173 = add i32 %add167, %val5.0
%add164 = add i32 %add173, %val6.0
%add176 = add i32 %add164, %val7.0
%add179 = add i32 %add176, %val8.0
%cmp181 = icmp eq i32 %rem, 56
br i1 %cmp181, label %if.then183, label %for.inc
The problem with this bitcode is that all additions form one long chain of dependent additions, rather than a tree of additions as in the original source code. For parallel architectures like VLIW architectures, this is detrimental for performance. Any idea how I can avoid this?
Thanks,
prof. Bjorn De Sutter
Computer Systems Lab
Ghent University
More information about the llvm-dev
mailing list