[llvm-commits] [polly] r142746 - in /polly/trunk/www: example_manual_matmul.html examples.html

Tobias Grosser grosser at fim.uni-passau.de
Sun Oct 23 03:57:49 PDT 2011


Author: grosser
Date: Sun Oct 23 05:57:49 2011
New Revision: 142746

URL: http://llvm.org/viewvc/llvm-project?rev=142746&view=rev
Log:
www: Move example into its own section (making room for another one)

Added:
    polly/trunk/www/example_manual_matmul.html
      - copied, changed from r142745, polly/trunk/www/examples.html
Modified:
    polly/trunk/www/examples.html

Copied: polly/trunk/www/example_manual_matmul.html (from r142745, polly/trunk/www/examples.html)
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/example_manual_matmul.html?p2=polly/trunk/www/example_manual_matmul.html&p1=polly/trunk/www/examples.html&r1=142745&r2=142746&rev=142746&view=diff
==============================================================================
--- polly/trunk/www/examples.html (original)
+++ polly/trunk/www/example_manual_matmul.html Sun Oct 23 05:57:49 2011
@@ -11,20 +11,16 @@
 <body>
 <!--#include virtual="menu.html.incl"-->
 <div id="content">
-  <!--*********************************************************************-->
-  <h1>Polly: Examples</h1>
-  <!--*********************************************************************-->
 <!--=====================================================================-->
-<h2>Optimize Matrix Multiplication Manually</h2>
+<h1>Polly: Execute the individual Polly passes manually</h1>
 <!--=====================================================================-->
 
-<p>Polly does not yet focus on end user, but on research and the development of
-new optimizations. Hence for the users of Polly it is often necessary to
-understand how Polly works internally. To get an to know the different steps
-taken during polyhedral compilation, we give a step by step example on how to
-use the different Polly passes. For this we optimize a simple matrix
-multiplication kernel. In case you look for a more automated way of executing
-Polly, check out the pollycc tool in utils/pollycc.</p>
+<p>
+This example presents the individual passes that are involved when optimizing
+code with Polly. We show how to execute them individually and explain for each
+which analysis is performed or what transformation is applied. In this example
+the polyhedral transformation is user-provided to show how much performance
+improvement can be expected by an optimal automatic optimizer.</p>
 
 The files used and created in this example are available <a
 href="experiments/matmul">here</a>. They can be created automatically by running

Modified: polly/trunk/www/examples.html
URL: http://llvm.org/viewvc/llvm-project/polly/trunk/www/examples.html?rev=142746&r1=142745&r2=142746&view=diff
==============================================================================
--- polly/trunk/www/examples.html (original)
+++ polly/trunk/www/examples.html Sun Oct 23 05:57:49 2011
@@ -11,448 +11,18 @@
 <body>
 <!--#include virtual="menu.html.incl"-->
 <div id="content">
-  <!--*********************************************************************-->
-  <h1>Polly: Examples</h1>
-  <!--*********************************************************************-->
 <!--=====================================================================-->
-<h2>Optimize Matrix Multiplication Manually</h2>
+<h1>Polly: Examples</h1>
 <!--=====================================================================-->
 
-<p>Polly does not yet focus on end user, but on research and the development of
-new optimizations. Hence for the users of Polly it is often necessary to
-understand how Polly works internally. To get an to know the different steps
-taken during polyhedral compilation, we give a step by step example on how to
-use the different Polly passes. For this we optimize a simple matrix
-multiplication kernel. In case you look for a more automated way of executing
-Polly, check out the pollycc tool in utils/pollycc.</p>
+<h2><a href="example_manual_matmul.html">Execute the individual Polly passes
+manually</a></h2>
 
-The files used and created in this example are available <a
-href="experiments/matmul">here</a>. They can be created automatically by running
-the <a href="experiments/matmul/runall.sh">runall.sh</a> script.
-
-<ol>
-<li><h4>Create LLVM-IR from the C code</h4>
-
-Polly works on LLVM-IR. Hence it is necessary to translate the source files into
-LLVM-IR. If more than on file should be optimized the files can be combined into
-a single file with llvm-link.
-
-<pre class="code">clang -S -emit-llvm matmul.c -o matmul.s</pre>
-</li>
-
-
-<li><h4>Load Polly automatically when calling the 'opt' tool</h4>
-
-Polly is not built into opt or bugpoint, but it is a shared library that needs
-to be loaded into these tools explicitally. The Polly library is called
-LVMPolly.so. For a cmake build it is available in the build/lib/ directory,
-autoconf creates the same file in
-build/tools/polly/{Release+Asserts|Asserts|Debug}/lib. For convenience we create
-an alias that automatically loads Polly if 'opt' is called.
-<pre class="code">
-export PATH_TO_POLLY_LIB="~/polly/build/lib/"
-alias opt="opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so"</pre>
-</li>
-
-<li><h4>Prepare the LLVM-IR for Polly</h4>
-
-Polly is only able to work with code that matches a canonical form. To translate
-the LLVM-IR into this form we use a set of canonicalication passes. For this
-example only three passes are necessary. To get good coverage on more
-complicated input files often more canonicalization passes are needed. pollycc
-contains a list of passes that have shown to be beneficial.
-<pre class="code">opt -S -mem2reg -loop-simplify -indvars matmul.s > matmul.preopt.ll</pre></li>
-
-<li><h4>Show the SCoPs detected by Polly (optional)</h4>
-
-To understand if Polly was able to detect SCoPs, we print the
-structure of the detected SCoPs. In our example two SCoPs were detected. One in
-'init_array' the other in 'main'.
-
-<pre class="code">opt -basicaa -polly-cloog -analyze -q matmul.preopt.ll</pre>
-
-<pre>
-init_array():
-for (c2=0;c2<=1023;c2++) {
-  for (c4=0;c4<=1023;c4++) {
-    Stmt_5(c2,c4);
-  }
-}
-
-main():
-for (c2=0;c2<=1023;c2++) {
-  for (c4=0;c4<=1023;c4++) {
-    Stmt_4(c2,c4);
-    for (c6=0;c6<=1023;c6++) {
-      Stmt_6(c2,c4,c6);
-    }
-  }
-}
-</pre>
-</li>
-<li><h4>Highlight the detected SCoPs in the CFGs of the program (requires graphviz/dotty)</h4>
-
-Polly can use graphviz to graphically show a CFG in which the detected SCoPs are
-highlighted. It can also create '.dot' files that can be translated by
-the 'dot' utility into various graphic formats.
-
-<pre class="code">opt -basicaa -view-scops -disable-output matmul.preopt.ll
-opt -basicaa -view-scops-only -disable-output matmul.preopt.ll</pre>
-The output for the different functions<br />
-view-scops:
-<a href="experiments/matmul/scops.main.dot.png">main</a>,
-<a href="experiments/matmul/scops.init_array.dot.png">init_array</a>,
-<a href="experiments/matmul/scops.print_array.dot.png">print_array</a><br />
-view-scops-only:
-<a href="experiments/matmul/scopsonly.main.dot.png">main</a>,
-<a href="experiments/matmul/scopsonly.init_array.dot.png">init_array</a>,
-<a href="experiments/matmul/scopsonly.print_array.dot.png">print_array</a>
-</li>
-
-<li><h4>View the polyhedral representation of the SCoPs</h4>
-<pre class="code">opt -basicaa -polly-scops -analyze matmul.preopt.ll</pre>
-<pre>
-[...]
-Printing analysis 'Polly - Create polyhedral description of Scops' for region:
-'%1 => %17' in function 'init_array':
-   Context:
-   { [] }
-   Statements {
-   	Stmt_5
-           Domain :=
-               { Stmt_5[i0, i1] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 };
-           Scattering :=
-               { Stmt_5[i0, i1] -> scattering[0, i0, 0, i1, 0] };
-           WriteAccess := 
-               { Stmt_5[i0, i1] -> MemRef_A[1037i0 + i1] };
-           WriteAccess := 
-               { Stmt_5[i0, i1] -> MemRef_B[1047i0 + i1] };
-   	FinalRead
-           Domain :=
-               { FinalRead[0] };
-           Scattering :=
-               { FinalRead[i0] -> scattering[200000000, o1, o2, o3, o4] };
-           ReadAccess := 
-               { FinalRead[i0] -> MemRef_A[o0] };
-           ReadAccess := 
-               { FinalRead[i0] -> MemRef_B[o0] };
-   }
-[...]
-Printing analysis 'Polly - Create polyhedral description of Scops' for region:
-'%1 => %17' in function 'main':
-   Context:
-   { [] }
-   Statements {
-   	Stmt_4
-           Domain :=
-               { Stmt_4[i0, i1] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 };
-           Scattering :=
-               { Stmt_4[i0, i1] -> scattering[0, i0, 0, i1, 0, 0, 0] };
-           WriteAccess := 
-               { Stmt_4[i0, i1] -> MemRef_C[1067i0 + i1] };
-   	Stmt_6
-           Domain :=
-               { Stmt_6[i0, i1, i2] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023 };
-           Scattering :=
-               { Stmt_6[i0, i1, i2] -> scattering[0, i0, 0, i1, 1, i2, 0] };
-           ReadAccess := 
-               { Stmt_6[i0, i1, i2] -> MemRef_C[1067i0 + i1] };
-           ReadAccess := 
-               { Stmt_6[i0, i1, i2] -> MemRef_A[1037i0 + i2] };
-           ReadAccess := 
-               { Stmt_6[i0, i1, i2] -> MemRef_B[i1 + 1047i2] };
-           WriteAccess := 
-               { Stmt_6[i0, i1, i2] -> MemRef_C[1067i0 + i1] };
-   	FinalRead
-           Domain :=
-               { FinalRead[0] };
-           Scattering :=
-               { FinalRead[i0] -> scattering[200000000, o1, o2, o3, o4, o5, o6] };
-           ReadAccess := 
-               { FinalRead[i0] -> MemRef_C[o0] };
-           ReadAccess := 
-               { FinalRead[i0] -> MemRef_A[o0] };
-           ReadAccess := 
-               { FinalRead[i0] -> MemRef_B[o0] };
-   }
-[...]
-</pre>
-</li>
-
-<li><h4>Show the dependences for the SCoPs</h4>
-<pre class="code">opt -basicaa -polly-dependences -analyze matmul.preopt.ll</pre>
-<pre>Printing analysis 'Polly - Calculate dependences for SCoP' for region:
-'for.cond => for.end28' in function 'init_array':
-   Must dependences:
-       {  }
-   May dependences:
-       {  }
-   Must no source:
-       {  }
-   May no source:
-       {  }
-Printing analysis 'Polly - Calculate dependences for SCoP' for region:
-'for.cond => for.end48' in function 'main':
-   Must dependences:
-       {  Stmt_4[i0, i1] -> Stmt_6[i0, i1, 0] :
-              i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023;
-          Stmt_6[i0, i1, i2] -> Stmt_6[i0, i1, 1 + i2] :
-              i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1022; 
-          Stmt_6[i0, i1, 1023] -> FinalRead[0] :
-              i1 <= 1091540 - 1067i0 and i1 >= -1067i0 and i1 >= 0 and i1 <= 1023;
-          Stmt_6[1023, i1, 1023] -> FinalRead[0] :
-              i1 >= 0 and i1 <= 1023 
-       }
-   May dependences:
-       {  }
-   Must no source:
-       {  Stmt_6[i0, i1, i2] -> MemRef_A[1037i0 + i2] :
-              i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023; 
-          Stmt_6[i0, i1, i2] -> MemRef_B[i1 + 1047i2] :
-              i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023; 
-          FinalRead[0] -> MemRef_A[o0];
-          FinalRead[0] -> MemRef_B[o0]
-          FinalRead[0] -> MemRef_C[o0] :
-              o0 >= 1092565 or (exists (e0 = [(o0)/1067]: o0 <= 1091540 and o0 >= 0
-              and 1067e0 <= -1024 + o0 and 1067e0 >= -1066 + o0)) or o0 <= -1;
-       }
-   May no source:
-       {  }
-</pre></li>
-
-<li><h4>Export jscop files</h4>
-
-Polly can export the polyhedral representation in so called jscop files. Jscop
-files contain the polyhedral representation stored in a JSON file.
-<pre class="code">opt -basicaa -polly-export-jscop matmul.preopt.ll</pre>
-<pre>Writing SCoP 'for.cond => for.end28' in function 'init_array' to './init_array___%for.cond---%for.end28.jscop'.
-Writing SCoP 'for.cond => for.end48' in function 'main' to './main___%for.cond---%for.end48.jscop'.
-</pre></li>
-
-<li><h4>Import the changed jscop files and print the updated SCoP structure
-(optional)</h4>
-<p>Polly can reimport jscop files, in which the schedules of the statements are
-changed. These changed schedules are used to descripe transformations.
-It is possible to import different jscop files by providing the postfix
-of the jscop file that is imported.</p>
-<p> We apply three different transformations on the SCoP in the main function.
-The jscop files describing these transformations are hand written. If PoCC is
-installed Polly can sometimes calculate such schedules fully automatically.
-Hwever, this is still an area we are actively working on.</p>
-
-<h5>No Polly</h5>
-
-<p>As a baseline we do not call any Polly code generation, but only apply the
-normal -O3 optimizations.</p>
-
-<pre class="code">
-opt matmul.preopt.ll -basicaa \
-    -polly-import-jscop \
-    -polly-cloog -analyze
-</pre>
-<pre>
-[...]
-main():
-for (c2=0;c2<g;=1535;c2++) {
-  for (c4=0;c4<g;=1535;c4++) {
-    Stmt_4(c2,c4);
-    for (c6=0;c6<g;=1535;c6++) {
-      Stmt_6(c2,c4,c6);
-    }
-  }
-}
-[...]
-</pre>
-<h5>Interchange (and Fission to allow the interchange)</h5>
-<p>We split the loops and can now apply an interchange of the loop dimensions that
-enumerate Stmt_6.</p>
-<pre class="code">
-opt matmul.preopt.ll -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged \
-    -polly-cloog -analyze
-</pre>
-<pre>
-[...]
-Reading JScop '%1 => %17' in function 'main' from './main___%1---%17.jscop.interchanged'.
-[...]
-main():
-for (c2=0;c2<=1535;c2++) {
-  for (c4=0;c4<=1535;c4++) {
-    Stmt_4(c2,c4);
-  }
-}
-for (c2=0;c2<=1535;c2++) {
-  for (c4=0;c4<=1535;c4++) {
-    for (c6=0;c6<=1535;c6++) {
-      Stmt_6(c2,c6,c4);
-    }
-  }
-}
-[...]
-</pre>
-<h5>Interchange + Tiling</h5>
-<p>In addition to the interchange we tile now the second loop nest.</p>
-
-<pre class="code">
-opt matmul.preopt.ll -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \
-    -polly-cloog -analyze
-</pre>
-<pre>
-[...]
-Reading JScop '%1 => %17' in function 'main' from './main___%1---%17.jscop.interchanged+tiled'.
-[...]
-main():
-for (c2=0;c2<=1535;c2++) {
-  for (c4=0;c4<=1535;c4++) {
-    Stmt_4(c2,c4);
-  }
-}
-for (c2=0;c2<=1535;c2+=64) {
-  for (c3=0;c3<=1535;c3+=64) {
-    for (c4=0;c4<=1535;c4+=64) {
-      for (c5=c2;c5<=c2+63;c5++) {
-        for (c6=c4;c6<=c4+63;c6++) {
-          for (c7=c3;c7<=c3+63;c7++) {
-            Stmt_6(c5,c7,c6);
-          }
-        }
-      }
-    }
-  }
-}
-[...]
-</pre>
-<h5>Interchange + Tiling + Strip-mining to prepare vectorization</h5>
-To later allow vectorization we create a so called trivially parallelizable
-loop. It is innermost, parallel and has only four iterations. It can be
-replaced by 4-element SIMD instructions.
-<pre class="code">
-opt matmul.preopt.ll -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-    -polly-cloog -analyze </pre>
-
-<pre>
-[...]
-Reading JScop '%1 => %17' in function 'main' from './main___%1---%17.jscop.interchanged+tiled+vector'.
-[...]
-main():
-for (c2=0;c2<=1535;c2++) {
-  for (c4=0;c4<=1535;c4++) {
-    Stmt_4(c2,c4);
-  }
-}
-for (c2=0;c2<=1535;c2+=64) {
-  for (c3=0;c3<=1535;c3+=64) {
-    for (c4=0;c4<=1535;c4+=64) {
-      for (c5=c2;c5<=c2+63;c5++) {
-        for (c6=c4;c6<=c4+63;c6++) {
-          for (c7=c3;c7<=c3+63;c7+=4) {
-            for (c8=c7;c8<=c7+3;c8++) {
-              Stmt_6(c5,c8,c6);
-            }
-          }
-        }
-      }
-    }
-  }
-}
-[...]
-</pre>
-
-</li>
-
-<li><h4>Codegenerate the SCoPs</h4>
-<p>
-This generates new code for the SCoPs detected by polly.
-If -polly-import is present, transformations specified in the imported openscop
-files will be applied.</p>
-<pre class="code">opt matmul.preopt.ll | opt -O3 > matmul.normalopt.ll</pre>
-<pre class="code">
-opt -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged \
-    -polly-codegen matmul.preopt.ll \
-   | opt -O3 > matmul.polly.interchanged.ll</pre>
-<pre>
-Reading JScop '%1 => %19' in function 'init_array' from
-    './init_array___%1---%19.jscop.interchanged'.
-File could not be read: No such file or directory
-Reading JScop '%1 => %17' in function 'main' from
-    './main___%1---%17.jscop.interchanged'.
-</pre>
-<pre class="code">
-opt -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \
-    -polly-codegen matmul.preopt.ll \
-   | opt -O3 > matmul.polly.interchanged+tiled.ll</pre>
-<pre>
-Reading JScop '%1 => %19' in function 'init_array' from
-    './init_array___%1---%19.jscop.interchanged+tiled'.
-File could not be read: No such file or directory
-Reading JScop '%1 => %17' in function 'main' from
-    './main___%1---%17.jscop.interchanged+tiled'.
-</pre>
-<pre class="code">
-opt -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-    -polly-codegen -enable-polly-vector matmul.preopt.ll \
-   | opt -O3 > matmul.polly.interchanged+tiled+vector.ll</pre>
-<pre>
-Reading JScop '%1 => %19' in function 'init_array' from
-    './init_array___%1---%19.jscop.interchanged+tiled+vector'.
-File could not be read: No such file or directory
-Reading JScop '%1 => %17' in function 'main' from
-    './main___%1---%17.jscop.interchanged+tiled+vector'.
-</pre>
-<pre class="code">
-opt -basicaa \
-    -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \
-    -polly-codegen -enable-polly-vector -enable-polly-openmp matmul.preopt.ll \
-  | opt -O3 > matmul.polly.interchanged+tiled+openmp.ll</pre>
-<pre>
-Reading JScop '%1 => %19' in function 'init_array' from
-    './init_array___%1---%19.jscop.interchanged+tiled+vector'.
-File could not be read: No such file or directory
-Reading JScop '%1 => %17' in function 'main' from
-    './main___%1---%17.jscop.interchanged+tiled+vector'.
-</pre>
-
-<li><h4>Create the executables</h4>
-
-Create one executable optimized with plain -O3 as well as a set of executables
-optimized in different ways with Polly. One changes only the loop structure, the
-other adds tiling, the next adds vectorization and finally we use OpenMP
-parallelism.
-<pre class="code">
-llc matmul.normalopt.ll -o matmul.normalopt.s && \
-    gcc matmul.normalopt.s -o matmul.normalopt.exe
-llc matmul.polly.interchanged.ll -o matmul.polly.interchanged.s && \
-    gcc matmul.polly.interchanged.s -o matmul.polly.interchanged.exe
-llc matmul.polly.interchanged+tiled.ll -o matmul.polly.interchanged+tiled.s && \
-    gcc matmul.polly.interchanged+tiled.s -o matmul.polly.interchanged+tiled.exe
-llc matmul.polly.interchanged+tiled+vector.ll -o matmul.polly.interchanged+tiled+vector.s && \
-    gcc matmul.polly.interchanged+tiled+vector.s -o matmul.polly.interchanged+tiled+vector.exe
-llc matmul.polly.interchanged+tiled+vector+openmp.ll -o matmul.polly.interchanged+tiled+vector+openmp.s && \
-    gcc -lgomp matmul.polly.interchanged+tiled+vector+openmp.s -o matmul.polly.interchanged+tiled+vector+openmp.exe </pre>
-
-<li><h4>Compare the runtime of the executables</h4>
-
-By comparing the runtimes of the different code snippets we see that a simple
-loop interchange gives here the largest performance boost. However by adding
-vectorization and by using OpenMP we can further improve the performance
-significantly.
-<pre class="code">time ./matmul.normalopt.exe</pre>
-<pre>42.68 real, 42.55 user, 0.00 sys</pre>
-<pre class="code">time ./matmul.polly.interchanged.exe</pre>
-<pre>04.33 real, 4.30 user, 0.01 sys</pre>
-<pre class="code">time ./matmul.polly.interchanged+tiled.exe</pre>
-<pre>04.11 real, 4.10 user, 0.00 sys</pre>
-<pre class="code">time ./matmul.polly.interchanged+tiled+vector.exe</pre>
-<pre>01.39 real, 1.36 user, 0.01 sys</pre>
-<pre class="code">time ./matmul.polly.interchanged+tiled+vector+openmp.exe</pre>
-<pre>00.66 real, 2.58 user, 0.02 sys</pre>
-</li>
-</ol>
+This example presents the individual passes that are involved when optimizing
+code with Polly. We show how to execute them individually and explain for each
+which analysis is performed or what transformation is applied. In this example
+the polyhedral transformation is user-provided to show how much performance
+improvement can be expected by an optimal automatic optimizer.
 
 </div>
 </body>





More information about the llvm-commits mailing list