<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">On Jun 24, 2013, at 3:09 PM, Chandler Carruth &lt;<a href="mailto:chandlerc@gmail.com">chandlerc@gmail.com</a>&gt; wrote:<br><div>On Mon, Jun 24, 2013 at 2:59 PM, Nadav Rotem <span dir="ltr">&lt;<a href="mailto:nrotem@apple.com" target="_blank" class="cremed">nrotem@apple.com</a>&gt;</span> wrote:<br><blockquote type="cite"><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; position: static; z-index: auto;">I agree. The vectorizer is a *lowering* pass, and much like LSR, it loses information.  A few months ago some of us talked about this and came up with a general draft for the ideal pass ordering. </blockquote><div><br></div><div>Where? On the mailing list?</div></div></div></div></blockquote><div dir="auto"><br></div><div dir="auto">These discussions had more to do with formalizing the use of target information within IR passes (legality, instruction-level cost). There were some list threads and offline discussion. I set this aside because I wasn’t sure how it was going to fit with some of the other work in progress, particularly LTO. </div><div dir="auto"><br></div><div dir="auto">I don’t think there’s any controversy over the high-level goals. But there will be controversy when we start proposing concrete pass ordering changes. 
</div><div><br></div>When I return to work mid-July, I’d be happy to send out some proposed changes for discussion. The first step will be an improved interface for IR-level cost metrics, which we already agreed to some time ago.</div><div><br><blockquote type="cite"><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; position: static; z-index: auto;">If I remember correctly, the plan was that the second half of the pipe should start with GVN (which currently runs after the loop passes). After that come the loop passes, the Vectorizers (loop vectorization first), and finally LSR, Lower-switch, CGP, etc.  I think that when we discussed this most people argued that the inliner should be before GVN and the loop passes. It would be interesting to see the performance numbers for the new pass order.  </blockquote></div><br>This doesn't make a lot of sense to me yet.</div></div></blockquote><blockquote type="cite"><div dir="ltr" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="gmail_extra">The inliner, GVN, and the loop passes run together, *iteratively*. They are neither before nor after one another. And this is important as it allows iterative simplification in the inliner. 
It is one of the most critical optimizations for C++ code that LLVM does.</div><div class="gmail_extra"><br></div><div class="gmail_extra">We can't sink all of the loop passes out of the iterative pass model either, because deleting loops, simplifying them, etc. all directly feed the iterative simplification needed by GVN and the inliner.</div><div class="gmail_extra"><br></div><div class="gmail_extra">We need a *second* loop pass that happens after the iterative CGSCC walk which does the further optimizations such as (potentially) indvars, the vectorizers, LSR, lower-switch, CGP, CG. I think we actually want most of the post-CGSCC module passes to run after the vectorizers and before LSR to fold away constants and globals that look different after vectorization compared to before, but aren't significantly shifted by LSR and CGP.</div></div></blockquote><br></div><div><div>I don't want to start a centi-thread yet, but here's a very rough idea (leaving many things out):</div><div><br></div><div>Canonicalize {</div><div>  Func {</div><div>    SimpCFG</div><div>    SROA-1</div><div>    EarlyCSE</div><div>  }</div><div>  CGSCC {</div><div>    Inline</div><div>    EarlyCSE</div><div>    SimpCFG</div><div>    InstCombine</div><div>    Early Loop Opts {</div><div>      LoopSimplify</div><div>      Rotate</div><div>      Obvious-Full-Unroll</div><div>    }</div><div>    SROA-2</div><div>    InstCombine</div><div>    GVN</div><div>    Reassociate</div><div>    Late Loop Opts {</div><div>      LICM</div><div>      Unswitch</div><div>    }</div><div>    SCCP</div><div>    InstCombine</div><div>    JT</div><div>    CVP</div><div>    DCE</div><div>  }</div><div>}</div><div>Lower {</div><div>  Target Loop Opts {</div><div>    IndvarSimplify</div><div>    Vectorize/Unroll</div><div>    LSR</div><div>  }</div><div>  SLP Vectorize</div><div>}</div><div><br></div><div>We might need to pull some things like exit value replacement out of IndvarSimplify into target-independent loop 
opts.</div><div><br></div><div>-Andy</div></div><br></body></html>