[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

Mon Jul 29 09:05:21 PDT 2013

On 7/16/2013 11:38 PM, Andrew Trick wrote:
> Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions.
>
> To centralize the discussion, until we get some documentation and better APIs in place, let me throw out an oversimplified Straw Man for a new pass pipeline. It serves two purposes: (1) an overdue reorganization of the pass pipeline and (2) a formalization of the role of TargetTransformInfo.
>
> ---
> Canonicalization passes are designed to normalize the IR in order to expose opportunities to subsequent machine independent passes. This simplifies writing machine independent optimizations and improves the quality of the compiler.
>
> An important property of these passes is that they are repeatable. They may be invoked multiple times after inlining and should converge to a canonical form. They should not destructively transform the IR in a way that defeats subsequent analysis.
>
> Canonicalization passes can make use of data layout and are affected by ABI, but are otherwise target independent. Adding target specific hooks to these passes can defeat the purpose of canonical IR.
>
> IR Canonicalization Pipeline:
>
> Function Passes {
>    SimplifyCFG
>    SROA-1
>    EarlyCSE
> }
> Call-Graph SCC Passes {
>    Inline
>    Function Passes {
>      EarlyCSE
>      SimplifyCFG
>      InstCombine
>      Early Loop Opts {
>        LoopSimplify
>        Rotate (when obvious)
>        Full-Unroll (when obvious)
>      }
>      SROA-2
>      InstCombine
>      GVN
>      Reassociate
>      Generic Loop Opts {
>        LICM (Rotate on-demand)
>        Unswitch
>      }
>      SCCP
>      InstCombine
>      JumpThreading
>      CorrelatedValuePropagation
>      AggressiveDCE
>    }
> }
>

I'm a bit late to this, but the "generic loop opts" listed above (LICM
and Unswitch) are really better left until later.  They have the
potential to obscure the code and make other loop optimizations harder.
Specifically, there has to be a place where loop nest optimizations can
be done (such as loop interchange or unroll-and-jam).  There is also
array expansion and loop distribution, which can be highly
target-dependent in terms of their applicability.  I don't know if TTI
could provide enough detail to account for all the circumstances that
would motivate such transformations, but assuming that it could, there
still needs to be room left for them in the design.
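
To make the ordering concern concrete, here is a rough sketch in plain
C++ (no LLVM headers).  Everything in it is an assumption made for
illustration: ToyTargetInfo and its fields, buildLoopPipeline, and
runsBefore are hypothetical names, not existing TTI queries or pass
manager APIs.  It only models pass names as strings, with a reserved
slot for loop nest transformations ahead of LICM and Unswitch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for target queries. These fields are invented for
// this sketch and are not part of the real TargetTransformInfo interface.
struct ToyTargetInfo {
  bool WantsLoopInterchange = false;
  bool WantsUnrollAndJam = false;
  bool WantsLoopDistribution = false;
};

// Returns the loop-related pass names in the order they would run.
std::vector<std::string> buildLoopPipeline(const ToyTargetInfo &TTI) {
  // Early loop opts from the straw man stay where they are.
  std::vector<std::string> Passes = {"LoopSimplify", "Rotate", "Full-Unroll"};

  // Reserved slot: loop nest transformations see the nest before LICM or
  // unswitching have had a chance to reshape it.
  if (TTI.WantsLoopInterchange)
    Passes.push_back("LoopInterchange");
  if (TTI.WantsUnrollAndJam)
    Passes.push_back("UnrollAndJam");
  if (TTI.WantsLoopDistribution)
    Passes.push_back("LoopDistribution");

  // Generic loop opts are deferred until after the slot.
  Passes.push_back("LICM");
  Passes.push_back("Unswitch");
  return Passes;
}

// True if pass A appears before pass B; false if either one is missing.
bool runsBefore(const std::vector<std::string> &Passes, const std::string &A,
                const std::string &B) {
  assert(A != B && "comparing a pass with itself");
  auto PosA = std::find(Passes.begin(), Passes.end(), A);
  auto PosB = std::find(Passes.begin(), Passes.end(), B);
  return PosA != Passes.end() && PosB != Passes.end() && PosA < PosB;
}
```

With WantsLoopInterchange set, the returned list places LoopInterchange
between Full-Unroll and LICM.  With no flags set, the slot is empty and
only the five passes named in the straw man remain.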

On a different but related note: I recently asked about the "proper"
solution for recognizing target-specific loop idioms.  On Hexagon, we
have builtin functions that handle certain specific loop patterns.  In
order to separate the target-dependent code from the
target-independent code, we would basically have to replicate the loop
idiom recognition in our own target-specific pass.  Not only that, but
it would have to run before the loops are subjected to other
optimizations that could obfuscate the opportunity.  To solve this, I
was thinking about having target-specific hooks in the idiom recognition
code that could transform a given loop in the target's own way.  Still,
that would imply target-specific transformations running before the
"official" lowering code.

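A minimal sketch of the hook idea, again in plain C++ rather than
against the real LLVM classes.  ToyLoop, TargetIdiomHook,
recognizeIdiom, toyTargetHook, and the "toy.builtin.bitreverse" name
are all made up for this example; real recognition would inspect the
IR instead of a string tag.  The point is only the control flow: the
generic driver offers each recognized loop to the target first and
falls back to its own handling otherwise.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Toy description of a loop. Idiom is the pattern the generic recognizer
// matched ("" if none); Result records what the loop was rewritten into.
struct ToyLoop {
  std::string Name;
  std::string Idiom;
  std::string Result;
};

// Hypothetical hook type: returns true if the target rewrote the loop.
using TargetIdiomHook = std::function<bool(ToyLoop &)>;

// Generic driver: the pattern matching is shared, and only the rewrite is
// delegated to the target when it wants the loop.
bool recognizeIdiom(ToyLoop &L, const TargetIdiomHook &Hook) {
  assert(L.Result.empty() && "loop was already rewritten");
  if (L.Idiom.empty())
    return false; // nothing recognized
  if (Hook && Hook(L))
    return true; // target handled it in its own way
  if (L.Idiom == "memset") {
    L.Result = "call memset"; // generic fallback
    return true;
  }
  return false;
}

// Example hook for a made-up target with a bit-reversal builtin.
bool toyTargetHook(ToyLoop &L) {
  if (L.Idiom != "bitreverse")
    return false;
  L.Result = "call toy.builtin.bitreverse";
  return true;
}
```

Passing toyTargetHook to recognizeIdiom rewrites a "bitreverse" loop
into the target builtin call, while a "memset" loop still takes the
generic path.  With an empty hook, the "bitreverse" loop is left
untouched, which is the case where a separate target pass would
otherwise have to repeat the recognition.
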
-K

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation