Index: docs/Passes.html
===================================================================
--- docs/Passes.html (revision 107585)
+++ docs/Passes.html (working copy)
@@ -27,7 +27,7 @@
 my $o = $order{$1};
 $o = "000" unless defined $o;
 push @x, "$o-$1$2\n";
- push @y, "$o $2\n";
+ push @y, "$o -$1: $2\n";
 }
 @x = map { s/^\d\d\d//; $_ } sort @x;
 @y = map { s/^\d\d\d//; $_ } sort @y;
@@ -91,29 +91,46 @@
 -dot-postdom-only  Print post dominator tree of function to 'dot' file (with no function bodies)
 -globalsmodref-aa  Simple mod/ref analysis for globals
 -instcount  Counts the various types of Instructions
+-interprocedural-aa-eval  Exhaustive Interprocedural Alias Analysis Precision Evaluator
+-interprocedural-basic-aa  Interprocedural Basic Alias Analysis
 -intervals  Interval Partition Construction
--loops  Natural Loop Construction
+-iv-users  Induction Variable Users
+-lazy-value-info  Lazy Value Information Analysis
+-lda  Loop Dependence Analysis
+-libcall-aa  LibCall Alias Analysis
+-lint  Check for common errors in LLVM IR
+-live-values  Value Liveness Analysis
+-loops  Natural Loop Information
 -memdep  Memory Dependence Analysis
+-module-debuginfo  Prints module debug info metadata
 -no-aa  No Alias Analysis (always returns 'may' alias)
 -no-profile  No Profile Information
+-pointertracking  Track pointer bounds
 -postdomfrontier  Post-Dominance Frontier Construction
 -postdomtree  Post-Dominator Tree Construction
 -print-alias-sets  Alias Set Printer
 -print-callgraph  Print a call graph
 -print-callgraph-sccs  Print SCCs of the Call Graph
 -print-cfg-sccs  Print SCCs of each function CFG
+-print-dbginfo  Print debug info in human readable form
+-print-dom-info  Dominator Info Printer
 -print-externalfnconstants  Print external fn callsites passed constants
 -print-function  Print function to stderr
 -print-module  Print module to stderr
 -print-used-types  Find Used Types
+-profile-estimator  Estimate profiling information
 -profile-loader  Load profile information from llvmprof.out
+-profile-verifier  Verify profiling information
 -scalar-evolution  Scalar Evolution Analysis
+-scev-aa  ScalarEvolution-based Alias Analysis
 -targetdata  Target Data Layout
 TRANSFORM PASSES
 Option  Name
+-abcd  Remove redundant conditional branches
 -adce  Aggressive Dead Code Elimination
+-always-inline  Inliner for always_inline functions
 -argpromotion  Promote 'by reference' arguments to scalars
 -block-placement  Profile Guided Basic Block Placement
 -break-crit-edges  Break critical edges in CFG
@@ -125,17 +142,14 @@
 -deadtypeelim  Dead Type Elimination
 -die  Dead Instruction Elimination
 -dse  Dead Store Elimination
+-functionattrs  Deduce function attributes
 -globaldce  Dead Global Elimination
 -globalopt  Global Variable Optimizer
 -gvn  Global Value Numbering
--indmemrem  Indirect Malloc and Free Removal
 -indvars  Canonicalize Induction Variables
 -inline  Function Integration/Inlining
--insert-block-profiling  Insert instrumentation for block profiling
 -insert-edge-profiling  Insert instrumentation for edge profiling
--insert-function-profiling  Insert instrumentation for function profiling
--insert-null-profiling-rs  Measure profiling framework overhead
--insert-rs-profiling-framework  Insert random sampling instrumentation framework
+-insert-optimal-edge-profiling  Insert optimal instrumentation for edge profiling
 -instcombine  Combine redundant instructions
 -internalize  Internalize Global Symbols
 -ipconstprop  Interprocedural constant propagation
@@ -152,22 +166,32 @@
 -loop-unroll  Unroll loops
 -loop-unswitch  Unswitch loops
 -loopsimplify  Canonicalize natural loops
--lowerallocs  Lower allocations from instructions to calls
 -lowerinvoke  Lower invoke and unwind, for unwindless code generators
 -lowersetjmp  Lower Set Jump
 -lowerswitch  Lower SwitchInst's to branches
 -mem2reg  Promote Memory to Register
 -memcpyopt  Optimize use of memcpy and friends
+-mergefunc  Merge Functions
 -mergereturn  Unify function exit nodes
+-partial-inliner  Partial Inliner
+-partialspecialization  Partial Specialization
 -prune-eh  Remove unused exception handling info
 -reassociate  Reassociate expressions
 -reg2mem  Demote all values to stack slots
 -scalarrepl  Scalar Replacement of Aggregates
 -sccp  Sparse Conditional Constant Propagation
+-sink  Code Sinking
 -simplify-libcalls  Simplify well-known library calls
+-simplify-libcalls-halfpowr  Simplify half_powr library calls
 -simplifycfg  Simplify the CFG
+-split-geps  Split complex GEPs into simple GEPs
+-ssi  Static Single Information Construction
+-ssi-everything  Static Single Information Construction (everything, intended for debugging)
 -strip  Strip all symbols from a module
+-strip-dead-debug-info  Strip debug info for unused symbols
 -strip-dead-prototypes  Remove unused function declarations
+-strip-debug-declare  Strip all llvm.dbg.declare intrinsics
+-strip-nondebug  Strip all symbols, except dbg symbols, from a module
 -sretpromotion  Promote sret arguments
 -tailcallelim  Tail Call Elimination
 -tailduplicate  Tail Duplication
@@ -177,6 +201,7 @@
 Option  Name
 -deadarghaX0r  Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE)
 -extract-blocks  Extract Basic Blocks From Module (for bugpoint use)
+-instnamer  Assign names to anonymous instructions
 -preverify  Preliminary module verification
 -verify  Module Verifier
 -view-cfg  View CFG of function
@@ -196,7 +221,7 @@
- Exhaustive Alias Analysis Precision Evaluator
+ -aa-eval: Exhaustive Alias Analysis Precision Evaluator

This is a simple N^2 alias analysis accuracy evaluator. @@ -210,7 +235,7 @@

- Basic Alias Analysis (default AA impl)
+ -basicaa: Basic Alias Analysis (default AA impl)

@@ -222,7 +247,7 @@

Yet to be written.

@@ -230,7 +255,7 @@

@@ -242,7 +267,7 @@

@@ -253,7 +278,7 @@

@@ -270,7 +295,7 @@

@@ -281,7 +306,7 @@

@@ -292,7 +317,7 @@

@@ -304,7 +329,7 @@

@@ -316,7 +341,7 @@

@@ -329,7 +354,7 @@

@@ -341,7 +366,7 @@

@@ -355,7 +380,7 @@

@@ -367,7 +392,7 @@

@@ -381,7 +406,7 @@

@@ -394,7 +419,7 @@

@@ -404,9 +429,32 @@

+

This pass implements a simple N^2 alias analysis accuracy evaluator. + Basically, for each function in the program, it simply queries to see how the + alias analysis implementation answers alias queries between each pair of + pointers in the function. +
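The pairwise evaluation described above can be sketched in a few lines. This is an illustration only, not LLVM code: `toy_alias`, the `(base, offset)` pointer model, and the result names are hypothetical stand-ins for a real alias analysis implementation.

```python
from collections import Counter
from itertools import combinations


def toy_alias(p, q):
    """Hypothetical oracle; a pointer is modeled as a (base, offset) pair."""
    if p[0] != q[0]:
        return "NoAlias"    # distinct base objects never overlap in this toy model
    if p[1] is None or q[1] is None:
        return "MayAlias"   # same base, but one offset is unknown
    return "MustAlias" if p[1] == q[1] else "NoAlias"


def evaluate(pointers, alias=toy_alias):
    """Query every unordered pair of pointers once and tally the answers."""
    return Counter(alias(p, q) for p, q in combinations(pointers, 2))


pointers = [("a", 0), ("a", 0), ("a", 4), ("b", 0), ("a", None)]
result = evaluate(pointers)
for kind in ("NoAlias", "MayAlias", "MustAlias"):
    print(kind, result[kind])
```

With n pointers the loop issues n(n-1)/2 queries, which is where the N^2 cost comes from.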

+
+ + + +
+

This pass defines the default implementation of the Alias Analysis interface + that simply implements a few identities (two different globals cannot alias, + etc), but otherwise does no analysis. +

+
+ + + +

This analysis calculates and represents the interval partition of a function, or a preexisting interval partition. @@ -420,9 +468,82 @@

+

Bookkeeping for "interesting" users of expressions computed from + induction variables.

+
+ + + +
+

Interface for lazy computation of value constraint information.

+
+ + + +
+

Loop dependence analysis framework, which is used to detect dependences in + memory accesses in loops.

+
+ + + +
+

LibCall Alias Analysis.

+
+ + + +
+

This pass statically checks for common and easily-identified constructs + which produce undefined or likely unintended behavior in LLVM IR.

+ +

It is not a guarantee of correctness, in two ways. First, it isn't + comprehensive. There are checks which could be done statically which are + not yet implemented. Some of these are indicated by TODO comments, but + those aren't comprehensive either. Second, many conditions cannot be + checked statically. This pass does no dynamic instrumentation, so it + can't check for all possible problems.

+ +

Another limitation is that it assumes all code will be executed. A store + through a null pointer in a basic block which is never reached is harmless, + but this pass will warn about it anyway.

+ +

Optimization passes may make conditions that this pass checks for more or + less obvious. If an optimization pass appears to be introducing a warning, + it may be that the optimization pass is merely exposing an existing + condition in the code.

+ +

This code may be run before instcombine. In many cases, instcombine checks + for the same kinds of things and turns instructions with undefined behavior + into unreachable (or equivalent). Because of this, this pass makes some + effort to look through bitcasts and so on. +

+
+ + + +
+

LLVM IR Value liveness analysis pass.

+
+ + + +

This analysis is used to identify natural loops and determine the loop depth of various nodes of the CFG. Note that the loops identified may actually be @@ -433,7 +554,7 @@

@@ -446,9 +567,22 @@

+

This pass decodes the debug info metadata in a module and prints it in a + (sufficiently-prepared-) human-readable form. + + For example, run this pass from opt along with the -analyze option, and + it will print to standard output. +

+
+ + + +

Always returns "I don't know" for alias queries. NoAA is unlike other alias analysis implementations, in that it does not chain to a previous analysis. As @@ -458,7 +592,7 @@

@@ -469,9 +603,18 @@

+

Tracking of pointer bounds. +

+
+ + + +

This pass is a simple post-dominator construction algorithm for finding post-dominator frontiers. @@ -480,7 +623,7 @@

@@ -491,7 +634,7 @@

Yet to be written.

@@ -499,7 +642,7 @@

@@ -510,7 +653,7 @@

@@ -521,7 +664,7 @@

@@ -532,9 +675,33 @@

+

Pass that prints instructions, and associated debug info: +

+
+  • source/line/col information
+  • original variable name
+  • original type name
+
+ +

+
+ + + +
+

Dominator Info Printer.

+
+ + + +

This pass, only available in opt, prints out call sites to external functions that are called with constant arguments. This can be @@ -545,7 +712,7 @@

@@ -557,7 +724,7 @@

@@ -567,7 +734,7 @@

@@ -578,9 +745,19 @@

+

An implementation of profiling information that produces its estimates + in a very crude and unimaginative way. +

+
+ + + +

A concrete implementation of profiling information that loads the information from a profile dump file. @@ -589,9 +766,17 @@

+

Pass that checks profiling information for plausibility.

+
+ + + +

The ScalarEvolution analysis can be used to analyze and categorize scalar expressions in loops. It specializes in recognizing general @@ -608,9 +793,47 @@

+

Simple alias analysis implemented in terms of ScalarEvolution queries. + + This differs from traditional loop dependence analysis in that it tests + for dependencies within a single iteration of a loop, rather than + dependencies between different iterations. + + ScalarEvolution has a more complete understanding of pointer arithmetic + than BasicAliasAnalysis' collection of ad-hoc analyses. +

+
+ + + +
+

+ Performs code stripping. This transformation can delete: +

+ +
+  1. names for virtual registers
+  2. symbols for internal globals and functions
+  3. debug information
+
+ +

+ Note that this transformation makes code much less readable, so it should + only be used in situations where the strip utility would be used, + such as reducing code size or making it harder to reverse engineer code. +

+
+ + + +

Provides other passes access to information on the size and alignment required by the target ABI for various data types.

@@ -623,9 +846,24 @@
+

ABCD removes conditional branch instructions that can be proved redundant. + With the SSI representation, each variable has a constraint. By analyzing these + constraints we can prove that a branch is redundant. When a branch is proved + redundant it means that one direction will always be taken; thus, we can change + this branch into an unconditional jump.

+

It is advisable to run SimplifyCFG and + Aggressive Dead Code Elimination after ABCD + to clean up the code.
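To make the idea of a provably redundant branch concrete, here is a toy sketch. It assumes the constraint on a variable is a closed integer interval `[lo, hi]` and the branch tests `x < bound`; both the interval model and the function names are assumptions for illustration, not a description of ABCD's actual constraint solving.

```python
def decide_less_than(lo, hi, bound):
    """Return True or False if `x < bound` is decided for every x in [lo, hi], else None."""
    if hi < bound:
        return True     # every possible value satisfies the test
    if lo >= bound:
        return False    # no possible value satisfies the test
    return None         # the constraint does not decide the branch


def simplify_branch(lo, hi, bound, then_target, else_target):
    """Rewrite a conditional branch as an unconditional jump when the outcome is proven."""
    outcome = decide_less_than(lo, hi, bound)
    if outcome is None:
        return ("br_cond", then_target, else_target)
    return ("br", then_target if outcome else else_target)


print(simplify_branch(0, 9, 10, "then", "else"))
print(simplify_branch(0, 10, 10, "then", "else"))
```

Once a branch becomes unconditional, the untaken successor may be unreachable, which is why a CFG cleanup afterwards is useful.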

+
+ + + +

ADCE aggressively tries to eliminate code. This pass is similar to DCE but it assumes that values are dead until proven otherwise. This is similar to SCCP, except applied to @@ -634,9 +872,18 @@

+

A custom inliner that handles only functions that are marked as + "always inline".

+
+ + + +

This pass promotes "by reference" arguments to be "by value" arguments. In practice, this means looking for internal functions that have pointer @@ -665,7 +912,7 @@

This pass is a very simple profile guided basic block placement algorithm. @@ -677,7 +924,7 @@

@@ -690,7 +937,7 @@

This pass munges the code in the input function to better prepare it for @@ -700,7 +947,7 @@

@@ -713,7 +960,7 @@

This file implements constant propagation and merging. It looks for @@ -729,7 +976,7 @@

@@ -741,7 +988,7 @@

@@ -759,7 +1006,7 @@

@@ -771,7 +1018,7 @@

@@ -782,7 +1029,7 @@

@@ -793,9 +1040,24 @@

+

A simple interprocedural pass which walks the call-graph, looking for + functions which do not access or only read non-local memory, and marking them + readnone/readonly. In addition, it marks function arguments (of pointer type) + 'nocapture' if a call to the function does not create any copies of the pointer + value that outlive the call. This more or less means that the pointer is only + dereferenced, and not returned from the function or stored in a global. + This pass is implemented as a bottom-up traversal of the call-graph. +
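The bottom-up deduction of readnone/readonly can be sketched as follows. This is a simplified model, not the pass's implementation: it assumes an acyclic call graph (no recursion), and the `local_effect` and `callees` dictionaries are hypothetical inputs standing in for what would be read from the IR.

```python
def infer_memory_attrs(local_effect, callees):
    """local_effect: fn -> 'none' | 'read' | 'write' (non-local memory touched directly).
    callees: fn -> list of functions it calls. Assumes the call graph has no cycles."""
    rank = {"none": 0, "read": 1, "write": 2}
    attr_for = {0: "readnone", 1: "readonly", 2: None}
    summary = {}

    def visit(fn):
        if fn in summary:
            return summary[fn]
        worst = rank[local_effect[fn]]
        for callee in callees.get(fn, []):
            # Callees are summarized before their callers (bottom-up).
            worst = max(worst, visit(callee))
        summary[fn] = worst
        return worst

    for fn in local_effect:
        visit(fn)
    return {fn: attr_for[level] for fn, level in summary.items()}


local_effect = {"leaf": "none", "getter": "read", "setter": "write",
                "wrap": "none", "top": "none"}
callees = {"wrap": ["leaf"], "top": ["wrap", "getter"]}
attrs = infer_memory_attrs(local_effect, callees)
print(attrs)
```

A function inherits the strongest effect of anything it calls, so `top` ends up readonly here even though it touches no memory itself.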

+
+ + + +

This transform is designed to eliminate unreachable internal globals from the program. It uses an aggressive algorithm, searching out globals that are @@ -807,7 +1069,7 @@

@@ -819,7 +1081,7 @@

@@ -828,30 +1090,12 @@

-

- This pass finds places where memory allocation functions may escape into - indirect land. Some transforms are much easier (aka possible) only if free - or malloc are not called indirectly. -

- -

- Thus find places where the address of memory functions are taken and construct - bounce functions with direct calls of those functions. -

-
- - - -
-

This transformation analyzes and transforms the induction variables (and computations derived from them) into simpler forms suitable for subsequent analysis and transformation. @@ -899,7 +1143,7 @@

@@ -909,29 +1153,10 @@

- This pass instruments the specified program with counters for basic block - profiling, which counts the number of times each basic block executes. This - is the most basic form of profiling, which can tell which blocks are hot, but - cannot reliably detect hot paths through the CFG. -

- -

- Note that this implementation is very naïve. Control equivalent regions of - the CFG should not require duplicate counters, but it does put duplicate - counters in. -

-
- - - -
-

This pass instruments the specified program with counters for edge profiling. Edge profiling can give a reasonable approximation of the hot paths through a program, and is used for a wide variety of program transformations. @@ -946,54 +1171,21 @@

-

- This pass instruments the specified program with counters for function - profiling, which counts the number of times each function is called. +

This pass instruments the specified program with counters for edge profiling. + Edge profiling can give a reasonable approximation of the hot paths through a + program, and is used for a wide variety of program transformations.

- The basic profiler that does nothing. It is the default profiler and thus - terminates RSProfiler chains. It is useful for measuring - framework overhead. -

-
- - - -
-

- The second stage of the random-sampling instrumentation framework, duplicates - all instructions in a function, ignoring the profiling code, then connects the - two versions together at the entry and at backedges. At each connection point - a choice is made as to whether to jump to the profiled code (take a sample) or - execute the unprofiled code. -

- -

- After this pass, it is highly recommended to run mem2reg - and adce. instcombine, - load-vn, gdce, and - dse also are good to run afterwards. -

-
- - - -
-

Combine instructions to form fewer, simple instructions. This pass does not modify the CFG. This pass is where algebraic simplification happens. @@ -1044,7 +1236,7 @@

@@ -1056,7 +1248,7 @@

@@ -1070,7 +1262,7 @@

@@ -1081,7 +1273,7 @@

@@ -1110,7 +1302,7 @@

@@ -1139,7 +1331,7 @@

@@ -1175,7 +1367,7 @@

@@ -1188,7 +1380,7 @@

@@ -1201,7 +1393,7 @@

@@ -1213,7 +1405,7 @@

@@ -1224,7 +1416,7 @@

@@ -1238,7 +1430,7 @@

A simple loop rotation transformation.

@@ -1246,7 +1438,7 @@

@@ -1258,7 +1450,7 @@

@@ -1288,7 +1480,7 @@

@@ -1329,7 +1521,7 @@

@@ -1345,7 +1537,7 @@

@@ -1386,7 +1578,7 @@

@@ -1415,7 +1607,7 @@

@@ -1427,7 +1619,7 @@

@@ -1443,7 +1635,7 @@

@@ -1454,9 +1646,30 @@

+

This pass looks for equivalent functions that are mergeable and folds them. + + A hash is computed from the function, based on its type and number of + basic blocks. + + Once all hashes are computed, we perform an expensive equality comparison + on each function pair. This takes n^2/2 comparisons per bucket, so it's + important that the hash function be high quality. The equality comparison + iterates through each instruction in each basic block. + + When a match is found the functions are folded. If both functions are + overridable, we move the functionality into a new internal function and + leave two overridable thunks to it. +
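The hash-then-compare scheme can be sketched like this. The function representation (a dict with a `type` string and a list of basic blocks, each a list of instruction strings) is a hypothetical stand-in for real IR, and the sketch only reports which functions could be folded; it does not create thunks.

```python
from collections import defaultdict


def cheap_hash(fn):
    # As in the description: only the type and the number of basic blocks feed the hash.
    return hash((fn["type"], len(fn["blocks"])))


def same_body(f, g):
    # Expensive comparison: walk every instruction of every basic block.
    if f["type"] != g["type"] or len(f["blocks"]) != len(g["blocks"]):
        return False
    return all(bf == bg for bf, bg in zip(f["blocks"], g["blocks"]))


def find_mergeable(functions):
    """Return a map from each duplicate function name to its representative."""
    buckets = defaultdict(list)
    for name, fn in functions.items():
        buckets[cheap_hash(fn)].append(name)
    merged = {}
    for names in buckets.values():
        representatives = []
        for name in names:
            for rep in representatives:
                if same_body(functions[name], functions[rep]):
                    merged[name] = rep
                    break
            else:
                representatives.append(name)
    return merged


functions = {
    "f": {"type": "i32(i32)", "blocks": [["add", "ret"]]},
    "g": {"type": "i32(i32)", "blocks": [["add", "ret"]]},
    "h": {"type": "i32(i32)", "blocks": [["mul", "ret"]]},
    "k": {"type": "void()", "blocks": [["ret"]]},
}
merged = find_mergeable(functions)
print(merged)
```

Here `f`, `g`, and `h` share a bucket because the hash ignores instruction contents, so the expensive comparison still runs on them; a weaker hash means larger buckets and more such comparisons.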

+
+ + + +

Ensure that functions have at most one ret instruction in them. Additionally, it keeps track of which node is the new exit node of the CFG. @@ -1465,9 +1678,35 @@

+

This pass performs partial inlining, typically by inlining an if + statement that surrounds the body of the function. +

+
+ + + +
+

This pass finds function arguments that are often a common constant and + specializes a version of the called function for that constant. + + This pass simply does the cloning for functions it specializes. It depends + on IPSCCP and DAE to clean up the results. + + The initial heuristic favors constant arguments that are used in control + flow. +
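One way to picture "arguments that are often a common constant" is a simple frequency count over call sites. The sketch below is an assumption-laden illustration: the `min_share` threshold, the use of `None` for non-constant arguments, and the function name are all hypothetical, and the pass's own heuristic (which favors constants used in control flow) is not modeled.

```python
from collections import Counter


def specialization_candidates(call_sites, min_share=0.5):
    """call_sites: list of argument tuples for one callee; None marks a non-constant argument.
    Returns {argument index: constant} for constants seen in at least min_share of the calls."""
    candidates = {}
    if not call_sites:
        return candidates
    for idx in range(len(call_sites[0])):
        counts = Counter(args[idx] for args in call_sites if args[idx] is not None)
        if not counts:
            continue
        value, hits = counts.most_common(1)[0]
        if hits / len(call_sites) >= min_share:
            candidates[idx] = value
    return candidates


call_sites = [(0, None), (0, 7), (0, None), (1, 8)]
candidates = specialization_candidates(call_sites)
print(candidates)
```

Each candidate would correspond to one cloned version of the callee with that argument fixed, leaving later cleanup passes to fold the constant through the clone.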

+
+ + + +

This file implements a simple interprocedural pass which walks the call-graph, turning invoke instructions into call instructions if and @@ -1478,7 +1717,7 @@

@@ -1501,7 +1740,7 @@

@@ -1518,7 +1757,7 @@

@@ -1540,7 +1779,7 @@

@@ -1563,9 +1802,19 @@

+

This pass moves instructions into successor blocks, when possible, so that + they aren't executed on paths where their results aren't needed. +

+
+ + + +

Applies a variety of small optimizations for calls to specific well-known function calls (e.g. runtime library functions). For example, a call @@ -1576,9 +1825,19 @@

+

Simple pass that applies an experimental transformation on calls + to specific functions. +

+
+ + + +

Performs dead code elimination and basic block merging. Specifically:

@@ -1595,11 +1854,48 @@
+

This pass breaks GEPs with more than 2 non-zero operands into smaller + GEPs each with no more than 2 non-zero operands. This exposes redundancy + between GEPs with common initial operand sequences. +

+
+ + + +
+

This pass converts a list of variables to the Static Single Information + form. + + We are building an on-demand representation, that is, we do not convert + every single variable in the target function to SSI form. Rather, we receive + a list of target variables that must be converted. We also do not + completely convert a target variable to the SSI format. Instead, we only + change the variable in the points where new information can be attached + to its live range, that is, at branch points. +

+
+ + + +
+

A pass that runs SSI on every non-void variable, intended for debugging. +

+
+ + + +

- Performs code stripping. This transformation can delete: + Performs code stripping. This transformation can delete:

    @@ -1609,7 +1905,7 @@

- Note that this transformation makes code much less readable, so it should + Note that this transformation makes code much less readable, so it should only be used in situations where the strip utility would be used, such as reducing code size or making it harder to reverse engineer code.

@@ -1617,7 +1913,7 @@

@@ -1630,9 +1926,43 @@

+

This pass implements code stripping. Specifically, it can delete: +

+  • names for virtual registers
+  • symbols for internal globals and functions
+  • debug information
+
+ Note that this transformation makes code much less readable, so it should + only be used in situations where the 'strip' utility would be used, such as + reducing code size or making it harder to reverse engineer code. +

+
+ + + +
+

This pass implements code stripping. Specifically, it can delete: +

+  • names for virtual registers
+  • symbols for internal globals and functions
+  • debug information
+
+ Note that this transformation makes code much less readable, so it should + only be used in situations where the 'strip' utility would be used, such as + reducing code size or making it harder to reverse engineer code. +

+
+ + + +

This pass finds functions that return a struct (using a pointer to the struct as the first argument of the function, marked with the 'sret' attribute) and @@ -1653,7 +1983,7 @@

@@ -1685,7 +2015,7 @@

@@ -1705,7 +2035,7 @@