[LLVMdev] scalarrepl fails to promote array of vector
Duncan Sands
baldrick at free.fr
Mon Mar 12 01:20:14 PDT 2012
Hi Fan,
> You said that scalarRepl gets shy about loads and stores of the entire
> aggregate. Then I use a test case:
>
> ; ModuleID = 'test1.ll'
> define i32 @fun(i32* nocapture %X, i32 %i) nounwind uwtable readonly {
> %stackArray = alloca <4 x i32>
> %XC = bitcast i32* %X to <4 x i32>*
> %arrayVal = load <4 x i32>* %XC
> store <4 x i32> %arrayVal, <4 x i32>* %stackArray
> %arrayVal1 = load <4 x i32>* %stackArray
> %1 = extractelement <4 x i32> %arrayVal1, i32 1
> ret i32 %1
> }
>
> $ opt -S -stats -scalarrepl test1.ll
> ; ModuleID = 'test1.ll'
>
> define i32 @fun(i32* nocapture %X, i32 %i) nounwind uwtable readonly {
> %XC = bitcast i32* %X to <4 x i32>*
> %arrayVal = load <4 x i32>* %XC
> %1 = extractelement <4 x i32> %arrayVal, i32 1
> ret i32 %1
> }
> ===-------------------------------------------------------------------------===
> ... Statistics Collected ...
> ===-------------------------------------------------------------------------===
>
> 1 mem2reg - Number of alloca's promoted with a single store
> 1 scalarrepl - Number of allocas promoted
>
> You can see that the stackArray is eliminated,
I think you may be confusing arrays and vectors: there is no stack array in
your example, only the vector <4 x i32>. As a general rule hardly any
optimization is done for loads and stores of arrays because front-ends don't
produce them much. Much more effort is made for vectors because they can be
important for getting good performance.
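
To make the distinction concrete, here is a small sketch in the same IR
syntax as the rest of this thread. The function and value names are made up
for illustration; only the types matter.

```llvm
define void @types_sketch() nounwind {
  ; vector type: angle brackets, a first-class SIMD value
  %vec = alloca <4 x i32>
  ; array type: square brackets, an aggregate laid out in memory
  %arr = alloca [4 x i32]
  ; array of vectors, like mat_alloc in the original test case
  %mat = alloca [4 x <4 x float>]
  ret void
}
```

Your test1.ll only uses the first form, which is why it says nothing about
how scalarrepl treats the second and third.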
Ciao, Duncan.
> although there is loads and
> stores of the entire aggregate.
>
> However, the optimised code is still not optimal. I want the code to load
> just one element from X instead of the whole array.
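
For reference, the result being asked for here might look like the sketch
below. It is written by hand, not produced by any pass, and it assumes that
reading only element 1 through an i32 pointer is acceptable for the caller.
The name %elt is invented for the example.

```llvm
define i32 @fun(i32* nocapture %X, i32 %i) nounwind uwtable readonly {
  ; address of element 1 of the four i32 values that %X points to
  %elt = getelementptr inbounds i32* %X, i32 1
  ; load just that element instead of the whole <4 x i32>
  %1 = load i32* %elt
  ret i32 %1
}
```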
>
> Thanks,
> David
>
>
>
>
>
> On Sun, Mar 11, 2012 at 5:22 AM, Chris Lattner <clattner at apple.com> wrote:
>
>
> On Mar 10, 2012, at 9:34 AM, Fan Dawei wrote:
>
> > Hi all,
> >
> > I want to use the scalarrepl pass to eliminate the allocation of mat_alloc,
> > which is of type [4 x <4 x float>], in the following program.
> >
> > $cat test.ll
> >
> > ; ModuleID = 'test.ll'
> >
> > define void @main(<4 x float>* %inArg, <4 x float>* %outArg, [4 x <4 x float>]* %constants) nounwind {
> > entry:
> > %inArg1 = load <4 x float>* %inArg
> > %mat_alloc = alloca [4 x <4 x float>]
> > %matVal = load [4 x <4 x float>]* %constants
> > store [4 x <4 x float>] %matVal, [4 x <4 x float>]* %mat_alloc
> > %0 = getelementptr inbounds [4 x <4 x float>]* %mat_alloc, i32 0, i32 0
> > %1 = load <4 x float>* %0
> > %2 = fmul <4 x float> %1, %inArg1
> > %3 = getelementptr inbounds [4 x <4 x float>]* %mat_alloc, i32 0, i32 1
> > %4 = load <4 x float>* %3
> > %5 = fmul <4 x float> %4, %inArg1
> > %6 = fadd <4 x float> %2, %5
> > %7 = getelementptr inbounds [4 x <4 x float>]* %mat_alloc, i32 0, i32 2
> > %8 = load <4 x float>* %7
> > %9 = fmul <4 x float> %8, %inArg1
> > %10 = fadd <4 x float> %6, %9
> > %11 = getelementptr inbounds [4 x <4 x float>]* %mat_alloc, i32 0, i32 3
> > %12 = load <4 x float>* %11
> > %13 = fadd <4 x float> %10, %12
> > %14 = getelementptr <4 x float>* %outArg, i32 1
> > store <4 x float> %13, <4 x float>* %14
> > ret void
> > }
> >
> > $ opt -S -stats -scalarrepl test.ll
> >
> > No transformation is performed. I've examined the source code of
> > scalarrepl. It seems this pass does not handle array allocations. Is there
> > another transformation pass I can use to eliminate this allocation?
>
> Hi David,
>
> ScalarRepl gets shy about loads and stores of the entire aggregate:
>
> > %matVal = load [4 x <4 x float>]* %constants
> > store [4 x <4 x float>] %matVal, [4 x <4 x float>]* %mat_alloc
>
> It is possible to generalize scalarrepl to handle these similar to the way
> it handles memcpy, but no one has done that yet. Also, it's not generally
> recommended to do stuff like this, because you'll get inefficient code from
> many parts of the optimizer and code generator.
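
One way to follow this advice, assuming the front-end can be changed to emit
per-row accesses, is to skip the local copy and load each <4 x float>
straight from %constants. The sketch below is a hand-written rewrite of the
test case, not the output of any pass; the value names are invented. Nothing
is stored between the original whole-array load and the row loads, so the
result should be the same.

```llvm
define void @main(<4 x float>* %inArg, <4 x float>* %outArg, [4 x <4 x float>]* %constants) nounwind {
entry:
  %inArg1 = load <4 x float>* %inArg
  ; rows 0..2 are loaded directly from %constants and scaled by the input
  %r0 = getelementptr inbounds [4 x <4 x float>]* %constants, i32 0, i32 0
  %m0 = load <4 x float>* %r0
  %p0 = fmul <4 x float> %m0, %inArg1
  %r1 = getelementptr inbounds [4 x <4 x float>]* %constants, i32 0, i32 1
  %m1 = load <4 x float>* %r1
  %p1 = fmul <4 x float> %m1, %inArg1
  %s1 = fadd <4 x float> %p0, %p1
  %r2 = getelementptr inbounds [4 x <4 x float>]* %constants, i32 0, i32 2
  %m2 = load <4 x float>* %r2
  %p2 = fmul <4 x float> %m2, %inArg1
  %s2 = fadd <4 x float> %s1, %p2
  ; row 3 is added as-is, matching the original test case
  %r3 = getelementptr inbounds [4 x <4 x float>]* %constants, i32 0, i32 3
  %m3 = load <4 x float>* %r3
  %s3 = fadd <4 x float> %s2, %m3
  %out = getelementptr <4 x float>* %outArg, i32 1
  store <4 x float> %s3, <4 x float>* %out
  ret void
}
```

With no alloca and no load or store of the whole [4 x <4 x float>], there is
nothing left for scalarrepl to promote.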
>
> -Chris
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev