[PATCH] D28593: Update loop branch_weight metadata after loop rotation.
Xin Tong via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 13 10:52:09 PST 2017
I am seeing them very different now. Did you try it with and without
the patch ? I am seeing we peel 2X without metadata adjustment and
once with metadata adjustment .
Thanks,
-Xin
On Fri, Jan 13, 2017 at 10:47 AM, Dehao Chen <danielcdh at gmail.com> wrote:
> Thanks!
>
> I tried the new test case, the assembly is now different, but still seem
> equivalent:
>
> #diff loop-rotate.s ../new/loop-rotate.s
> 47c47
> < .loc 1 0 3 is_stmt 0 discriminator 1 # loop-rotate.c:0:3
> ---
>> .loc 1 16 13 # loop-rotate.c:16:13
> 51d50
> < .loc 1 16 13 is_stmt 1 # loop-rotate.c:16:13
> 103a103
>> xorl %ebp, %ebp
> 106d105
> < xorl %ebp, %ebp
> 193c192
> < .asciz "/ssd/llvm/build.gccopt/1" # string offset=69
> ---
>> .asciz "/ssd/llvm/build.gccopt/2" # string offset=69
>
> Dehao
>
>
> On Fri, Jan 13, 2017 at 10:38 AM, Xin Tong <trent.xin.tong at gmail.com> wrote:
>>
>> Dehao. I am putting up a new example, the command line to reproduce is
>> the same. The example before suffers from block merging in
>> BranchFolding in downstream in llc. Because there is no
>> simplifications done to peeled block. With this example, we will see
>> that we overpeeled the loop due to not updating the metadata properly
>> after rotation.
>>
>> Thanks
>> -Xin
>>
>> // looprotate.c
>> const int looparr[4] = {1,2,3,4};
>> int g = 0;
>>
>> __attribute__((noinline)) void call_me(int loopvalue) {
>> g = loopvalue;
>> }
>>
>>
>> // Invoke call_me once when done_call is false.
>> // This is a case where the loop executes once everytime the function
>> // is called.
>>
>> __attribute__((noinline)) void always_call(int max) {
>> for (int i = 0; i < max; ++i)
>> call_me(looparr[i]);
>> }
>>
>>
>> // Invoke call_me once when done_call is false and index is odd.
>> // This is a case where the loop does not execute even once half of the
>> // time because of the index.
>> __attribute__((noinline)) void call_on_odd(int max, int index) {
>> for (int i = 0; i < max && index % 2; ++i)
>> call_me(looparr[max]);
>> }
>>
>>
>> int main() {
>> const int max = 1;
>> for (int i = 0; i < 128; ++i) {
>> always_call(max);
>> call_on_odd(max, i);
>> }
>> return 0;
>> }
>>
>> On Fri, Jan 13, 2017 at 9:36 AM, Xin Tong <trent.xin.tong at gmail.com>
>> wrote:
>> > On Fri, Jan 13, 2017 at 9:34 AM, Dehao Chen <danielcdh at gmail.com> wrote:
>> >> OK, but the with the new patch, the generated assembly file is
>> >> identical
>> >> with the assembly generated by a clean compiler (without the patch).
>> >>
>> >> Dehao
>> >
>> > Yes, there is something in branchfolding in llc that folds away some
>> > of the branches. I am looking at why.
>> >
>> > -Xin
>> >>
>> >> On Fri, Jan 13, 2017 at 9:21 AM, Xin Tong <trent.xin.tong at gmail.com>
>> >> wrote:
>> >>>
>> >>> On Fri, Jan 13, 2017 at 7:36 AM, Dehao Chen via Phabricator
>> >>> <reviews at reviews.llvm.org> wrote:
>> >>> > danielcdh added a comment.
>> >>> >
>> >>> > Looks like this patch will make the "always call" worse:
>> >>> >
>> >>> > Without this patch:
>> >>> >
>> >>> > pushq %rbx
>> >>> > movq %rdi, %rbx
>> >>> > cmpl $0, (%rbx)
>> >>> > jne .LBB1_3
>> >>> >
>> >>> > .LBB1_1: # =>This Inner Loop Header:
>> >>> > Depth=1
>> >>> >
>> >>> > movq %rbx, %rdi
>> >>> > callq call_me
>> >>> > cmpl $0, (%rbx)
>> >>> > je .LBB1_1
>> >>> >
>> >>> > .LBB1_3:
>> >>> >
>> >>> > popq %rbx
>> >>> > retq
>> >>> >
>> >>> > With this patch:
>> >>> >
>> >>> > pushq %rbx
>> >>> > movq %rdi, %rbx
>> >>> > cmpl $0, (%rbx)
>> >>> > je .LBB1_1
>> >>> >
>> >>> > .LBB1_3:
>> >>> >
>> >>> > popq %rbx
>> >>> > retq
>> >>> >
>> >>> > .LBB1_1: # =>This Inner Loop Header:
>> >>> > Depth=1
>> >>> >
>> >>> > movq %rbx, %rdi
>> >>> > callq call_me
>> >>> > cmpl $0, (%rbx)
>> >>> > jne .LBB1_3
>> >>> > jmp .LBB1_1
>> >>> >
>> >>> > As the trip count of this loop is always 1, the first code will have
>> >>> > no
>> >>> > taken branches, while with this patch, it will have 2 taken
>> >>> > branches.
>> >>> >
>> >>> >
>> >>> > https://reviews.llvm.org/D28593
>> >>> >
>> >>> >
>> >>> >
>> >>>
>> >>> Sorry there is a mistaken updating the branch weight for the guard
>> >>> block. I fixed it. what i said before still hold true.
>> >>> -Xin
>> >>
>> >>
>
>
More information about the llvm-commits
mailing list