[LLVMdev] How to make Polly ignore some non-affine memory accesses
Tobias Grosser
tobias at grosser.es
Fri Oct 7 18:38:52 PDT 2011
On 10/07/2011 03:43 PM, Marcello Maggioni wrote:
> 2011/10/7 Marcello Maggioni<hayarms at gmail.com>:
>> Hi,
>>
>> for example this loop:
>>
>> #include<stdio.h>
>>
>> int main()
>> {
>> int A[1024];
>> int j, k=10;
>> for (j = 1; j< 1024; j++)
>> A[j] = k;
>>
>> return 0;
>> }
>>
>> run with:
>>
>> #!/bin/bash
>> source ../set_path.source
>> clang -S -emit-llvm $1.c -o $1.s
>> opt -S -mem2reg -loop-simplify -indvars $1.s> $1.preopt.ll
>> opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so -polly-detect -analyze $1.preopt.ll
>>
>> Using the instructions found on the Polly website.
There are two reasons why it does not work with these instructions:
1. Not the right preoptimization passes
The instructions on the Polly website use a very short sequence of
preoptimization passes that just matches the simple example presented.
To optimize real world programs that are not 100% standard you need a
larger sequence of prepasses. In your example the loop induction variable
does not start at '0', so further canonicalization is required.
The minimal sequence of optimization passes for your example is:
opt -mem2reg -instcombine -simplifycfg -loop-rotate -indvars
2. -enable-iv-rewrite
LLVM recently disabled induction variable rewriting by default. This
feature is necessary for the Polly preoptimizations. Hence, you need to
re-enable it with the above flag during preoptimization.
-> opt -mem2reg -instcombine -simplifycfg -loop-rotate -indvars
-enable-iv-rewrite test.s -S > test.preopt.ll
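As a rough sketch, the preoptimization step and the detection step from your script could be wrapped like this. It assumes PATH_TO_POLLY_LIB is set as in your script; the OPT variable and the function name are made up for illustration and are not part of LLVM or Polly.

```shell
#!/bin/sh
# Sketch only: wraps the two opt invocations from this thread.
# Assumes PATH_TO_POLLY_LIB points at the directory holding LLVMPolly.so.
# OPT is a hypothetical override for the opt binary (defaults to "opt").

PREOPT_FLAGS="-mem2reg -instcombine -simplifycfg -loop-rotate -indvars -enable-iv-rewrite"

# run_polly_detect BASE: preoptimize BASE.s into BASE.preopt.ll, then
# print the SCoP detection results for the preoptimized file.
run_polly_detect() {
    base="$1"
    opt_bin="${OPT:-opt}"
    # PREOPT_FLAGS is left unquoted on purpose so it splits into flags.
    "$opt_bin" -S $PREOPT_FLAGS "$base.s" > "$base.preopt.ll" || return 1
    "$opt_bin" -load "${PATH_TO_POLLY_LIB}/LLVMPolly.so" \
        -polly-detect -analyze "$base.preopt.ll"
}
```

After 'clang -S -emit-llvm test.c -o test.s', calling 'run_polly_detect test' should write test.preopt.ll and print the detection output.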
In general I use an even longer sequence of preoptimizations. This
sequence is built directly into Polly and is defined in
RegisterPasses.cpp. To get the corresponding opt flags you can load
Polly into clang and use the following command (-O3 is important):
> clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm
-debug-pass=Arguments
Pass Arguments: -targetdata -no-aa -targetlibinfo -tbaa -basicaa
-preverify -domtree -verify -mem2reg -instcombine -simplifycfg
-tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify
-lcssa -loop-rotate -instcombine -scalar-evolution -loop-simplify -lcssa
-indvars -polly-prepare -postdomtree -domfrontier -regions
-polly-region-simplify -scalar-evolution -loop-simplify -lcssa -indvars
-postdomtree -domfrontier -regions -polly-detect -polly-independent
-polly-analyze-ir -polly-scops -polly-dependences -polly-optimize-isl
-polly-cloog -polly-codegen -simplifycfg -domtree -scalarrepl -early-cse
-lower-expect
If you remove the first pass (-targetdata) and the passes after
-polly-detect you get the list of preparing transformations (and the
analysis they need).
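If you do not want to trim the list by hand, a small filter along these lines should do it. This is only a sketch: the function name is hypothetical, and it assumes you pass in the text that follows "Pass Arguments:" as one string.

```shell
#!/bin/sh
# Sketch only: reduce a "Pass Arguments:" list to the preparing passes.
# prep_passes is a hypothetical helper name, not part of LLVM or Polly.

# prep_passes "LIST": drop a leading -targetdata and everything that
# comes after -polly-detect; print the remaining flags on one line.
prep_passes() {
    printf '%s' "$1" | tr '\n' ' ' | awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if (i == 1 && $i == "-targetdata") continue  # drop the first pass
            out = out (out == "" ? "" : " ") $i
            if ($i == "-polly-detect") break             # stop after -polly-detect
        }
        print out
    }'
}
```

The argument may span several lines, as in the output above; line breaks are treated like spaces.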
Today I also further improved clang support. With the patch: 'Fix
parsing of command line options for LLVM plugins' as posted today on
cfe-commits, you can get the following new features:
1. Optimize a .c file automatically
Run: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm
-enable-polly-viewer -mllvm -enable-iv-rewrite
Polly + all the preparing transformations are automatically scheduled.
2. Graphical SCoP viewer
Run: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm
-enable-polly-viewer -mllvm -enable-iv-rewrite
Shows for every function a graphviz graph that highlights the detected
SCoPs. For every non-SCoP region it also shows the reason why it is not
a SCoP. (This needs graphviz installed when LLVM's configure is run. You
may also consider installing xdot.py [1] as a more convenient .dot file
viewer. xdot.py will be used automatically if it is in the PATH during
'configure' of LLVM.)
3. OpenMP:
Run: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm
-enable-polly-openmp -lgomp
This will automatically OpenMP parallelize your code.
(Remember that for all of this, LLVM, Clang and Polly need to be built
together so that they are in sync and .so loading works.)
Cheers
Tobi
[1] http://code.google.com/p/jrfonseca/wiki/XDot