Rafael Avila de Espindola
rafael.espindola at gmail.com
Fri Sep 5 13:26:27 PDT 2014
Cool. Note that I only hit the bug. I am not current on the selection dag code. Chandler or Jim are probably the best to review the patch.
> On Sep 5, 2014, at 6:08, Jiangning Liu <liujiangning1 at gmail.com> wrote:
>
> Hi Rafael,
>
> I think I've got a solution to fix this slowdown issue. I personally think the ISEL infrastructure needs to be improved, and CopyValueToVirtualRegisters in SelectionDAGBuilder is called too many times.
>
> I have now changed my algorithm to make the decision early, before ISEL, and store the info in FuncInfo. We can do this because deciding the preferred sext/zext depends only on the LLVM IR, not on SDNodes. This way, we can calculate the info once and use it many times in the real ISEL stage.
>
> I will send out a patch update later on, and my initial experiment shows that the huge case you gave me can now finish in 6 minutes. It's really a good test case to measure compile-time. :-)
>
> Thanks,
> -Jiangning
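
The idea described above, deciding the preferred extension from the IR once and reusing the answer during instruction selection, can be illustrated with a small sketch. This is not the patch and not LLVM code: the `FuncInfo` class here only borrows the name from the message, and `precompute`, `preferred_ext`, `decide_calls`, and the placeholder rule in `toy_decide` are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, Iterable


class FuncInfo:
    """Hypothetical per-function cache; only the name comes from the message."""

    def __init__(self, decide: Callable[[str], str]) -> None:
        self._decide = decide                  # IR-level analysis (assumed expensive)
        self._preferred: Dict[str, str] = {}   # IR value name -> "sext" or "zext"
        self.decide_calls = 0                  # instrumentation for this example only

    def precompute(self, ir_values: Iterable[str]) -> None:
        """Run the analysis once per IR value, before selection starts."""
        for value in ir_values:
            self._preferred[value] = self._decide(value)
            self.decide_calls += 1

    def preferred_ext(self, value: str) -> str:
        """Cheap lookup used during selection; never re-runs the analysis."""
        return self._preferred[value]


def toy_decide(value_name: str) -> str:
    # Placeholder rule for illustration only; the real criterion is in the patch.
    return "sext" if value_name.startswith("s") else "zext"


info = FuncInfo(toy_decide)
info.precompute(["s_idx", "u_len"])

# Selection may query the same value many times; the analysis count stays fixed.
queries = [info.preferred_ext("s_idx") for _ in range(1000)]
```

The point of the sketch is only the shape of the change: the lookups no longer depend on how often the selection stage asks, which matches the reasoning that the decision depends on the IR rather than on SDNodes.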
>
>
>
> 2014-09-05 11:31 GMT+08:00 Jiangning Liu <liujiangning1 at gmail.com>:
>> Hi Rafael,
>>
>> Attached is that test case, but I can't see slowdown with it.
>>
>> Thanks,
>> -Jiangning
>>
>>
>>
>> 2014-09-04 21:55 GMT+08:00 Rafael Espíndola <rafael.espindola at gmail.com>:
>>> Can you put that testcase somewhere?
>>>
>>> On 4 September 2014 01:19, Jiangning Liu <liujiangning1 at gmail.com> wrote:
>>> > Hi Rafael,
>>> >
>>> >
>>> > 2014-08-29 19:10 GMT+08:00 Rafael Espíndola <rafael.espindola at gmail.com>:
>>> >
>>> >> On 29 August 2014 05:16, Jiangning Liu <liujiangning1 at gmail.com> wrote:
>>> >> > Hi Rafael and Bob,
>>> >> >
>>> >> > The case you gave is really huge! :-)
>>> >>
>>> >> Yes, sorry, it is the LTO of clang :-)
>>> >>
>>> >> > I tried it, and it turned out that it is not an infinite loop; it can
>>> >> > finish in ~70 minutes.
>>> >> >
>>> >> > I tried llc command line option -time-passes, and it shows
>>> >> >
>>> >> >
>>> >> > ===-------------------------------------------------------------------------===
>>> >> >                       ... Pass execution timing report ...
>>> >> > ===-------------------------------------------------------------------------===
>>> >> > Total Execution Time: 4125.4617 seconds (4124.7082 wall clock)
>>> >> >
>>> >> >    ---User Time---   --System Time--    --User+System--     ---Wall Time---  --- Name ---
>>> >> > 3911.0328 ( 95.1%)   8.5007 ( 65.8%)  3919.5335 ( 95.0%)  3920.7144 ( 95.1%)  X86 DAG->DAG Instruction Selection
>>> >> >   47.5946 (  1.2%)   0.6397 (  5.0%)    48.2343 (  1.2%)    48.1823 (  1.2%)  Greedy Register Allocator
>>> >> >   16.7073 (  0.4%)   0.0244 (  0.2%)    16.7317 (  0.4%)    16.7890 (  0.4%)  Simple Register Coalescing
>>> >> >   11.6154 (  0.3%)   0.0164 (  0.1%)    11.6318 (  0.3%)    11.7178 (  0.3%)  Machine Instruction Scheduler
>>> >> >   10.8118 (  0.3%)   0.0677 (  0.5%)    10.8794 (  0.3%)    10.3740 (  0.3%)  Loop Strength Reduction
>>> >> >
>>> >> > So the problem is around "X86 DAG->DAG Instruction Selection".
>>> >> >
>>> >> > I tried to capture the "hot" spots using a debugger, but I failed; it
>>> >> > seems the time is accumulated somewhere.
>>> >> >
>>> >> > Do you have any suggestions?
>>> >>
>>> >> You can try running llvm-extract with every function and then running
>>> >> llc on the result (which will have only one function). Hopefully you
>>> >> will find a much smaller testcase that way.
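
The suggestion above (extract each function with llvm-extract, then run llc on the single-function result) can be sketched as a small driver script. This is a minimal sketch under assumptions: the list of function names is supplied by the caller, the helper names `build_commands`, `time_function`, and `slowest` are hypothetical, and `llvm-extract` and `llc` are assumed to be on `PATH`. Only the `-func=` option of llvm-extract and the `-o` option of both tools are used.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Tuple


def build_commands(module: str, func: str, out_dir: str) -> Tuple[List[str], List[str]]:
    """Return the llvm-extract and llc command lines for one function."""
    bitcode = str(Path(out_dir) / f"{func}.bc")
    assembly = str(Path(out_dir) / f"{func}.s")
    extract = ["llvm-extract", f"-func={func}", module, "-o", bitcode]
    compile_cmd = ["llc", bitcode, "-o", assembly]
    return extract, compile_cmd


def time_function(module: str, func: str, out_dir: str) -> float:
    """Extract one function and return the wall-clock seconds llc spends on it."""
    extract, compile_cmd = build_commands(module, func, out_dir)
    subprocess.run(extract, check=True)
    start = time.perf_counter()
    subprocess.run(compile_cmd, check=True)
    return time.perf_counter() - start


def slowest(module: str, funcs: List[str], out_dir: str, top: int = 5) -> List[Tuple[float, str]]:
    """Time every function and return the `top` slowest as (seconds, name)."""
    timings = [(time_function(module, f, out_dir), f) for f in funcs]
    return sorted(timings, reverse=True)[:top]
```

Per-function timing of this kind only finds slowdowns that are visible within a single function, so it may come up empty if the cost depends on the whole module.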
>>> >
>>> >
>>> > Thanks for your suggestion. I tried this method and successfully extracted
>>> > 27041 functions from that huge file. However, I failed to find a small
>>> > case containing a single function that reproduces the slowdown. The
>>> > slowest function I found is
>>> > _ZN5clang15StmtVisitorBaseINS_8make_ptrENS_13ASTStmtWriterEvE5VisitEPNS_4StmtE.bc,
>>> > but it finishes in 16 seconds on my x86 box.
>>> >
>>> > So it seems there are some module passes triggering the slowdown issue...
>>> >
>>> > Thanks,
>>> > -Jiangning
>>> >
>>> >>
>>> >>
>>> >> > And I'm wondering whether this is an x86-specific issue, or whether the
>>> >> > slowdown can also be exposed on other targets like aarch64?
>>> >>
>>> >> Hard to tell without a smaller testcase.
>>> >>
>>> >> Cheers,
>>> >> Rafael
>>> >
>>> >
>