[llvm-commits] CVS: reopt/docs/ReoptUsersGuide.rtf
Brian Gaeke
gaeke at cs.uiuc.edu
Fri Oct 29 14:50:19 PDT 2004
Changes in directory reopt/docs:
ReoptUsersGuide.rtf updated: 1.4 -> 1.5
---
Log message:
Latest edits, including most of Tanya's suggested edits
---
Diffs of the changes: (+126 -28)
Index: reopt/docs/ReoptUsersGuide.rtf
diff -u reopt/docs/ReoptUsersGuide.rtf:1.4 reopt/docs/ReoptUsersGuide.rtf:1.5
--- reopt/docs/ReoptUsersGuide.rtf:1.4 Fri Oct 15 14:48:02 2004
+++ reopt/docs/ReoptUsersGuide.rtf Fri Oct 29 16:50:08 2004
@@ -1,9 +1,10 @@
{\rtf1\mac\ansicpg10000\cocoartf102
{\fonttbl\f0\fswiss\fcharset77 Helvetica;\f1\fswiss\fcharset77 Helvetica-Bold;\f2\froman\fcharset77 TimesNewRomanMS;
\f3\fmodern\fcharset77 Courier;\f4\fmodern\fcharset77 Courier-Oblique;\f5\fswiss\fcharset77 Helvetica-Oblique;
-\f6\fmodern\fcharset77 Courier-Bold;}
+\f6\fswiss\fcharset77 Helvetica-BoldOblique;\f7\fmodern\fcharset77 Courier-BoldOblique;\f8\fmodern\fcharset77 Courier-Bold;
+}
{\colortbl;\red255\green255\blue255;}
-\margl1440\margr1440\vieww15300\viewh18220\viewkind0
+\margl1440\margr1440\vieww16300\viewh10320\viewkind0
\deftab720
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
@@ -18,6 +19,9 @@
\f0\b0 \cf0 \ulnone \
Introduction to the reoptimizer\
+ Link-time profiling instrumentation\
+ Runtime profiler and trace generator\
+ Trace optimizer\
Building the reoptimizer from source\
Running the reoptimizer on a program in the llvm-test module
\f2 \
@@ -76,13 +80,56 @@
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\f0\b0 \cf0 \ulnone \
-The reoptimizer is LLVM's dynamic optimization framework. It consists of three basic parts: a collection of compile-time profiling instrumentation passes, a runtime profiler and trace generator, and a runtime trace optimizer.\
+The reoptimizer is LLVM's dynamic optimization framework. It consists of three basic parts: a collection of profiling instrumentation passes, a runtime profiler and trace generator, and a runtime trace optimizer.\
\
-The profiling instrumentation passes run at compile time are as follows: [
-\f1\b TO DO
-\f0\b0 - also need to explain basic terms like FLI, SLI; it would be nice to have a diagram here]\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 \ul Link-time profiling instrumentation\ulnone \
\
-For more information about the reoptimizer's instrumentation passes and runtime profiler, consult Anand Shukla's M.S. thesis, "Lightweight Cross-Procedure Tracing for Runtime Optimization", July 2003.\
+The reoptimizer's profiling instrumentation passes are run at link time by the reopt-llc tool. The exact list of passes that are run, along with the LLVM source files that define them, is as follows:\
+\
+ * Function inlining (
+\f3 lib/Transforms/IPO/InlineSimple.cpp
+\f0 )\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 Additional inlining is intended to increase the effectiveness of interprocedural tracing.\
+\
+ * Lower LLVM 'switch' instructions to branches (
+\f3 lib/Transforms/Scalar/LowerSwitch.cpp
+\f0 )\
+\
+ * Lower LLVM 'invoke' instructions to setjmp/longjmp calls (
+\f3 lib/Transforms/Scalar/LowerInvoke.cpp
+\f0 )\
+\
+ * Combine multiple back-edge branches into a single branch (
+\f3 lib/Transforms/Instrumentation/ProfilePaths/CombineBranch.cpp
+\f0 )\
+The CombineBranch pass requires the LowerSwitch and LowerInvoke passes to function correctly.\
+\
+ * Emit a table of pointers to every function (
+\f3 lib/Transforms/Instrumentation/EmitFunctions.cpp
+\f0 )\
+This pass inserts the llvmSimpleFunction, llvmFunctionTable, and llvmFunctionCount symbols into the module. These symbols contain mapping information used by the reoptimizer.\
+\
+ * Instrument back-edge branches of loops (
+\f3 lib/Transforms/Instrumentation/ProfilePaths/InstLoops.cpp
+\f0 )\
+This pass is also known as first-level instrumentation, or "FLI". Each back-edge branch is instrumented with a call to the llvm_first_trigger function, which is defined in
+\f3 reopt/lib/LightWtProfiling/FirstTrigger.cpp
+\f0 .\
+\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 \ul Runtime profiler and trace generator\ulnone \
+\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+
+\f1\b \cf0 [FIXME - not done.]
+\f0\b0 For more information about the reoptimizer's instrumentation passes and runtime profiler, consult Anand Shukla's M.S. thesis, "Lightweight Cross-Procedure Tracing for Runtime Optimization", July 2003.\
+\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 \ul Trace optimizer\ulnone \
+\
+See the section "How the trace optimizer works", below.\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
@@ -148,8 +195,24 @@
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\f6\i\b \cf0 You must set your
+\f7 LLVM_REOPT
+\f6 environment variable
+\f0\i0\b0 to contain any command-line options you want to pass to the reoptimizer,
+\f6\i\b before running any tests using the
+\f7 run-tests
+\f6 script
+\f0\i0\b0 . A full list of
+\f3 LLVM_REOPT
+\f0 settings can be found in the section entitled "The
+\f3 LLVM_REOPT
+\f0 environment variable", below.
+\f3 \
+\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+
\f1\b \cf0 \ul Options recognized by the
-\f6 \ul "run-tests"
+\f8 \ul "run-tests"
\f1 \ul script
\f0\b0 \ulnone \
\
@@ -198,7 +261,15 @@
\f0 \
set to
\f3 '--debug --enable-trace-opt'
-\f0 .\
+\f0 . This is equivalent\
+ to setting your
+\f3 LLVM_REOPT
+\f0 variable to contain\
+
+\f3 '--skip-trace=
+\f4\i N
+\f3\i0 '
+\f0 (see below).\
\
-release
\f3
@@ -274,9 +345,11 @@
\f3 '-ljpeg'
\f0 ) by the program under consideration.\
\
-4. Now, you should set your
-\f3 LLVM_REOPT
-\f0 environment variable to contain any command-line options you want to pass to the reoptimizer. The most important ones are
+4. Now,
+\f6\i\b you must set your
+\f7 LLVM_REOPT
+\f6 environment variable
+\f0\i0\b0 to contain any command-line options you want to pass to the reoptimizer. The most important ones are
\f3 '--debug'
\f0 , to turn on debugging printouts, and
\f3 '--enable-trace-opt'
@@ -291,7 +364,7 @@
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\f1\b \cf0 \ul The
-\f6 \ul LLVM_REOPT
+\f8 \ul LLVM_REOPT
\f1 \ul environment variable
\f0\b0 \ulnone \
\
@@ -304,19 +377,33 @@
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ulnone -fli-threshold=
\f5\i count
-\f0\i0 Number of times FLI must trigger before attempting SLI\
+\f0\i0 Number of times FLI must trigger before attempting SLI.\
+ The default is 30.\
-sli-threshold=
\f5\i count
-\f0\i0 Number of iterations of SLI before path counters are sampled\
-\
+\f0\i0 Number of iterations of SLI before path counters are sampled.\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 The default is 50.\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ul Phase detection options\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
-\cf0 \ulnone -enable-phase-detect Use a timer interrupt to remove traces periodically from the trace cache\
+\cf0 \ulnone -enable-phase-detect Use a timer interrupt to remove traces periodically from the trace cache.\
+ The default is
+\f5\i false
+\f0\i0 , i.e., traces are
+\f5\i not
+\f0\i0 periodically removed from the\
+ trace cache.\
-timer-int-s=
\f5\i seconds
-\f0\i0 Interval (in seconds) between phase detection sweeps\
-\
+\f0\i0 Interval (in seconds) between phase detection sweeps.\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 The default is 3.0 seconds. Fractional values are also\
+ accepted.\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ul Trace layout engine options\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
@@ -332,7 +419,10 @@
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ul Trace optimizer options\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
-\cf0 \ulnone -enable-trace-opt Use the new trace optimizer instead of the old trace layout engine\
+\cf0 \ulnone -enable-trace-opt Use the new trace optimizer instead of the old trace layout engine.\
+ If there is a program on which the trace optimizer fails and the trace\
+ layout engine works, or vice-versa, you can use this option to control\
+ which one is used.\
-run-opt-passes Run optimization passes before unpacking TraceFunction\
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
@@ -344,24 +434,32 @@
-opt-trace-cache-size=
\f5\i size
\f0\i0 Trace cache size for optimized code\
-\
+
+\f1\b [FIXME: what are the defaults for these? what are the units in which they are specified?]\
+
+\f0\b0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ul Debugging options\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
-\cf0 \ulnone -print-machineinstrs Print generated machine code\
+\cf0 \ulnone -debug Turn on debugging printouts from TraceToFunction,\
+ UnpackTraceFunction, and the various other reoptimizer libraries\
+ (in addition to those available from the other options listed below).\
+\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
+\cf0 -print-machineinstrs Print generated machine code for traces.\
-skip-trace=
\f5\i n
\f0\i0 Don't optimize the
\f5\i n
-\f0\i0 th trace, when using trace optimizer\
- -debug Turn on debugging printouts\
+\f0\i0 th trace, when using trace optimizer. If you are\
+ using the run-tests script, you can use its
+\f3 '-skiptrace'
+\f0 option to set\
+ this option automatically; see above.\
-dregalloc=y Turn on SparcV9 register allocator debugging printouts\
-disable-ttf Disable TraceToFunction, UnpackTraceFunction and branch stitching\
-\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
-\cf0 -disable-utf Disable UnpackTraceFunction and branch stitching\
+ -disable-utf Disable UnpackTraceFunction and branch stitching\
-disable-branch-stitch Disable branch stitching\
-\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
-\cf0 \
+\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
\cf0 \ul Statistics gathering options\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardeftab720\ql\qnatural
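
For reference, the LLVM_REOPT setting described in this revision of the guide can be sketched in shell as follows. This is a minimal illustration, not taken from the guide itself: the threshold values are only the defaults stated above (30 and 50), and the double-dash spelling follows the guide's own '--debug --enable-trace-opt' example. The sketch only sets and prints the variable; it does not invoke the run-tests script, whose arguments are not shown in this diff.

```shell
# Sketch only: collect reoptimizer options in LLVM_REOPT before running tests.
# The option names come from the guide; the values are its stated defaults.
LLVM_REOPT='--debug --enable-trace-opt --fli-threshold=30 --sli-threshold=50'
export LLVM_REOPT

# Print the setting so it can be checked before any tests are started.
printf '%s\n' "$LLVM_REOPT"
```

Because the variable is exported, any process started afterwards from the same shell inherits it.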