[LLVMdev] MI scheduler produce badly code with inline function
    Zakk 
    zakk0610 at gmail.com
       
    Wed Oct 16 02:15:48 PDT 2013
    
    
  
2013/10/16 Andrew Trick <atrick at apple.com>
>
> On Oct 15, 2013, at 9:28 PM, Zakk <zakk0610 at gmail.com> wrote:
>
> Hi Andy, thanks for your help!!
> With the new machine model, method A schedules the code the same way as
> method B.
> That makes sense, but there is another problem: the scheduled code is
> bad.
>
> The load/store instructions always reuse the same register.
>
>
> I filed PR17593 with this information. However, I see opposite results
> from what you’re expecting. The code that uses fewer registers runs 4%
> faster on my cortex-a9. The integer unit is out-of-order.
>
> I think you should use clang alone to generate the assembly, not clang + llc.
I also replied at http://llvm.org/bugs/show_bug.cgi?id=17593
> Is this just because the A9's per-operand machine model is not implemented
> well?
> By the way, why do you want to use the new machine model for mi-sched?
>
>
> I want to move all the targets we support to the new machine model so it
> will be easier to maintain the scheduler. Additionally, the new model is
> much more efficient and simpler (if you don’t use special features). It is
> also correct for both preRA and postRA. Note that in the case of A9, the
> .td file for the new machine model is horribly complicated because it
> handles load multiple instructions. The A9 itinerary doesn’t even attempt
> to do that. (This was done mainly to demonstrate the feature set of the new
> model, not because it’s terribly important). The new model for A9 is also
> complicated by a mapping from the old itinerary classes to the new machine
> model.
>
I got it, thanks. Writing itinerary classes is really tedious...
Kind regards
Kuan-Hsu
> -Andy
>
> Thanks,
>
> Kind regards
> Kuan-Hsu
>
>
>
> 2013/10/15 Andrew Trick <atrick at apple.com>
>
>>
>> On Oct 14, 2013, at 3:27 AM, Zakk <zakk0610 at gmail.com> wrote:
>>
>> Hi all,
>> I ran into this problem when compiling the STREAM benchmark (
>> http://www.cs.virginia.edu/stream/FTP/Code/) with -enable-misched.
>>
>> A small function is scheduled well on its own, but once opt inlines it,
>> the inlined part is scheduled badly.
>>
>>
>> A bug for this is welcome. Pretty soon, I’ll be verifying A9 performance
>> and changing the default scheduler. When I do this, I’ll be using the new
>> machine model:
>>
>> (-mllvm) -sched-itins=false
>>
>> However, some scheduler changes are required for that mode to fully
>> enforce pipeline hazards.
>>
>> So I wrote a simple test case (foo.c, linked below) and compiled it with
>> two different methods:
>>
>> method A:
>> $ clang -O3 foo.c -static -S -o foo.s -mllvm -enable-misched -mllvm
>> -unroll-count=4 --target=arm -mfloat-abi=hard -mcpu=cortex-a9
>> -fno-vectorize -fno-slp-vectorize
>>
>> and
>>
>> method B:
>> $ clang foo.c -S -emit-llvm -o foo.bc --target=arm -mfloat-abi=hard
>> -mcpu=cortex-a9
>> $ opt foo.bc -O3 -unroll-count=4 -o foo.opt.bc
>> $ llc foo.opt.bc -o foo.opt.s -march=arm -mcpu=cortex-a9 -enable-misched
>>
>>
>> You can try “clang -O3 -mllvm -disable-llvm-optzns …”. clang should
>> generate the same bitcode, but skip the “opt” step.
>>
>> If that doesn’t work, it can be a nightmare trying to decompose the
>> compilation steps with fidelity. You can try:
>> - clang -### …
>> - clang -mllvm -print-options …
>> - Passing a full triple to all tools with -mtriple
>> - Debug the TargetOptions fields
>> - -print-after-all to see which phase is different
>>
>> Even if you get all the options right, the process of serializing and
>> rereading the IR can affect the optimizations.
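
One way to act on the -print-after-all suggestion above is to save the dump from each pipeline and look for the first pass whose output differs. This is only a minimal sketch. It assumes the dumps were saved to two log files and that every section starts with a banner line containing "*** IR Dump After". The helper names split_dumps and first_diff_pass are hypothetical and not part of LLVM.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two saved -print-after-all logs.
# Assumption: each pass section begins with a banner line that contains
# the text "*** IR Dump After".

# split_dumps LOG OUTDIR: write each section of LOG to OUTDIR/NNNN.txt,
# keeping the banner as the first line of its section.
split_dumps() {
    mkdir -p "$2"
    awk -v dir="$2" '
        /\*\*\* IR Dump After/ {
            if (out) close(out)
            n++
            out = sprintf("%s/%04d.txt", dir, n)
        }
        out { print > out }
    ' "$1"
}

# first_diff_pass LOG_A LOG_B: print the banner (from LOG_A) of the first
# section that differs between the two logs; print nothing if all match.
first_diff_pass() {
    tmp=$(mktemp -d)
    split_dumps "$1" "$tmp/a"
    split_dumps "$2" "$tmp/b"
    for f in "$tmp"/a/*.txt; do
        [ -e "$f" ] || continue          # LOG_A had no sections at all
        g="$tmp/b/$(basename "$f")"
        if ! cmp -s "$f" "$g"; then      # also true when LOG_B ran out
            head -n 1 "$f"
            break
        fi
    done
    rm -rf "$tmp"
}
```

The logs would come from adding -mllvm -print-after-all to the clang command, and -print-after-all to opt and llc, with the dump (normally written to stderr) redirected to a file each time. Sections are compared by position, so if the two pipelines run different pass lists, the reported banner marks where the pass sequences diverge rather than a change in the IR itself.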
>>
>> Sorry. I’ve been trying to think of a way to improve this situation.
>>
>> -Andy
>>
>> (P.S. I checked with debug-pass=structure, so I think the two pipelines
>> are equivalent.)
>>
>> But the results are different:
>> In LBB1_4 of foo.s, the same register is always reused for the
>> computation, whereas in LBB1_4 of foo.opt.s it is not.
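
To compare just the LBB1_4 block of foo.s and foo.opt.s without reading both files in full, the block can be cut out of each file and diffed. This is a rough sketch. It assumes a label is any unindented line whose first word ends in ":", which may not hold for every assembler dialect. The helper names extract_block and diff_block are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing one basic block across two .s files.
# Assumption: a label is an unindented line whose first word ends in ":".

# extract_block LABEL FILE: print the lines from "LABEL:" up to, but not
# including, the next label.
extract_block() {
    awk -v want="$1:" '
        { first = substr($0, 1, 1) }
        $1 == want { on = 1; print; next }
        on && $1 ~ /:$/ && first != " " && first != "\t" { exit }
        on { print }
    ' "$2"
}

# diff_block LABEL FILE_A FILE_B: unified diff of the same block in two
# files; the exit status is the one returned by diff.
diff_block() {
    a=$(mktemp)
    b=$(mktemp)
    extract_block "$1" "$2" > "$a"
    extract_block "$1" "$3" > "$b"
    diff -u "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

With the files from this thread, the call would be diff_block LBB1_4 foo.s foo.opt.s. If the label is spelled with a leading dot in the output, pass .LBB1_4 instead.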
>>
>> My question is: how can I use clang alone (method A) to get the same
>> result as method B? Or am I missing something here?
>>
>> I really appreciate any help and suggestions.
>> Thanks
>>
>> Kuan-Hsu
>>
>> ------- file link -------
>> foo.c: http://goo.gl/nVa2K0
>> foo.s: http://goo.gl/ML9eNj
>> foo.opt.s: http://goo.gl/31PCnf
>>
>>
>>
>
>
> --
> Best regards,
> Kuan-Hsu
>
>
>
>
-- 
Best regards,
Kuan-Hsu