<div dir="ltr">Hi Rafael,<br><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-29 19:10 GMT+08:00 Rafael Espíndola <span dir="ltr"><<a href="mailto:rafael.espindola@gmail.com" target="_blank">rafael.espindola@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="">On 29 August 2014 05:16, Jiangning Liu <<a href="mailto:liujiangning1@gmail.com">liujiangning1@gmail.com</a>> wrote:<br>

> Hi Rafael and Bob,<br>

><br>

</div><div class="">> The case you gave is really huge! :-)<br>

<br>

</div>Yes, sorry, it is the LTO of clang :-)<br>

<div class=""><br>

> I tried it, and it turned out not to be an infinite loop; it can finish in<br>

> ~70 minutes.<br>

><br>

> I tried the llc command-line option -time-passes, and it shows<br>

><br>

> ===-------------------------------------------------------------------------===<br>

>                       ... Pass execution timing report ...<br>

> ===-------------------------------------------------------------------------===<br>

>   Total Execution Time: 4125.4617 seconds (4124.7082 wall clock)<br>

><br>

>    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---<br>

>   3911.0328 ( 95.1%)   8.5007 ( 65.8%)  3919.5335 ( 95.0%)  3920.7144 ( 95.1%)  X86 DAG->DAG Instruction Selection<br>

>   47.5946 (  1.2%)   0.6397 (  5.0%)  48.2343 (  1.2%)  48.1823 (  1.2%)  Greedy Register Allocator<br>

>   16.7073 (  0.4%)   0.0244 (  0.2%)  16.7317 (  0.4%)  16.7890 (  0.4%)  Simple Register Coalescing<br>

>   11.6154 (  0.3%)   0.0164 (  0.1%)  11.6318 (  0.3%)  11.7178 (  0.3%)  Machine Instruction Scheduler<br>

>   10.8118 (  0.3%)   0.0677 (  0.5%)  10.8794 (  0.3%)  10.3740 (  0.3%)  Loop Strength Reduction<br>

><br>

> So the problem is around "X86 DAG->DAG Instruction Selection".<br>

><br>

> I tried to capture the "hot" spots using a debugger, but I failed, and it seems<br>

> the time is accumulated somewhere.<br>

><br>

> Do you have any suggestions?<br>

<br>

</div>You can try running llvm-extract with every function and then running<br>

llc on the result (which will have only one function). Hopefully you<br>

will find a much smaller testcase that way.<br></blockquote><div><br></div><div>Thanks for your suggestion. I tried this method and successfully extracted 27041 functions from that huge file. However, I could not find a single-function test case that reproduces the slowdown. The slowest function I found is _ZN5clang15StmtVisitorBaseINS_8make_ptrENS_13ASTStmtWriterEvE5VisitEPNS_4StmtE.bc, but it finishes in 16 seconds on my x86 box.</div><div><br></div><div>So it seems that some module passes are triggering the slowdown...</div><div><br></div><div>Thanks,</div><div>-Jiangning</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div class=""><br>

> And I'm wondering whether this is an x86-specific issue or whether the slowdown can<br>

> also be exposed on other targets like aarch64?<br>

<br>

</div>Hard to tell without a smaller testcase.<br>

<br>

Cheers,<br>

Rafael<br>

</blockquote></div><br></div></div>
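<div dir="ltr">The per-function approach discussed above (run llvm-extract on each function, then time llc on the single-function result) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names (parse_defined_functions, time_function, slowest) are made up for illustration, the set of llvm-nm symbol-type letters treated as defined functions is a guess that may need adjusting, and llc is run with its default options rather than whatever options were used in the thread.<br>
<pre>
```python
import os
import subprocess
import sys
import tempfile
import time

# Assumption: llvm-nm marks defined functions with T/t (text) or W (weak
# definition, e.g. C++ template instantiations). Adjust if needed.
DEFINED_TYPES = {"T", "t", "W"}


def parse_defined_functions(nm_output):
    """Return the names of defined functions found in llvm-nm text output."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        # A symbol line ends with "TYPE NAME"; anything else is skipped.
        if len(parts) >= 2 and parts[-2] in DEFINED_TYPES:
            names.append(parts[-1])
    return names


def time_function(module, func, workdir):
    """Extract one function with llvm-extract and time llc on the result."""
    single = os.path.join(workdir, "single.bc")
    subprocess.run(
        ["llvm-extract", f"-func={func}", module, "-o", single], check=True
    )
    start = time.perf_counter()
    subprocess.run(["llc", single, "-o", os.devnull], check=True)
    return time.perf_counter() - start


def slowest(timings, n=5):
    """Return the n (name, seconds) pairs with the largest times."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical usage: python bisect_functions.py big_module.bc
if __name__ == "__main__" and len(sys.argv) == 2:
    module = sys.argv[1]
    nm = subprocess.run(
        ["llvm-nm", module], check=True, capture_output=True, text=True
    )
    timings = {}
    with tempfile.TemporaryDirectory() as workdir:
        for func in parse_defined_functions(nm.stdout):
            timings[func] = time_function(module, func, workdir)
    for name, secs in slowest(timings):
        print(f"{secs:8.2f}s  {name}")
```
</pre>
If no single function turns out to be slow on its own, as reported in this thread, the script at least rules out the per-function explanation and gives a ranked list to combine into larger test cases.</div>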