[PATCH] Fix a performance problem in gep(gep ...) merging

Tue Apr 21 09:52:17 PDT 2015

Ok, I will submit the patch later today, and then collect the stat data.

Thanks,
Wei.

On Tue, Apr 21, 2015 at 9:02 AM, Xinliang David Li <xinliangli at gmail.com> wrote:
> Given the performance data and analysis, I suggest submit your original
> patch that is already reviewed if there are no other objections. As a follow
> up, please also add GEP depth limit stats and collect some data.
>
> thanks,
>
> David
>
>
> On Mon, Apr 20, 2015 at 9:41 PM, Wei Mi <wmi at google.com> wrote:
>>
>> I did some performance testing for the reassociategep patch proposed
>> by Mark. The testing result was mixed -- reassociategep patch was
>> better for some benchmarks and disabing gep merging was better for
>> some other benchmarks, and combining the two patches worked the best.
>> We did some perf analysis based on some testcases extracted from those
>> benchmarks with the largest perf difference between the two patches:
>>
>> * The testcase 1.cc in https://llvm.org/bugs/show_bug.cgi?id=23163
>> showed reassociategep patch is needed:
>>
>> For the testcase 1.cc, there are some sext generated for array address
>> computation. Existing CSE/LICM
>> cannot find the common expression separated by "sext". A special
>> reassociation is required.
>>
>> loop:
>> ... = sext(a + 1) + c  // a and c are loop invariant.
>> ...
>> ... = sext(a - 1) + c
>> end loop.
>>
>> CSE/LICM cannot find that "sext(a) + c" is a common expression and a
>> loop invariant expression. reassociategep does the reassociation so
>> CSE/LICM can optimize it.
>>
>> When I changed the testcase a little bit to remove the sext needed,
>> disabling gep merging patch generated the same code with
>> reassociategep patch.
>>
>> <   int c = p1.m_fn1(), x;
>> ---
>> >   long c = p1.m_fn1(), x;
>>
>> * The testcase 2.cc attached showed why disabling gep merging is needed.
>> The command compiling 2.cc is:
>> ~/workarea/llvm-r234390/build/bin/clang++ -O2 -std=c++11
>> -fno-omit-frame-pointer -fno-strict-aliasing -funsigned-char -S 2.cc
>> -o 2.s
>>
>> For the kernel loop, the IR before Codegen Prepare using the
>> reassociategep patch:
>>
>> for.body10:
>>   ...
>>   %add.ptr.sum = add nsw i64 %mul14, %idx.ext
>>   %add.ptr16 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr.sum
>>   %15 = load i32, i32* %add.ptr16, align 4
>>   ...
>>   %add.ptr16.sum = add nsw i64 %add.ptr.sum, %idxprom
>>   %arrayidx19 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr16.sum
>>   %17 = load i32, i32* %arrayidx19, align 4
>>   ...
>>   %add.ptr16.sum82 = add nsw i64 %mul22, %add.ptr.sum
>>   %add.ptr24 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr16.sum82
>>   %19 = load i32, i32* %add.ptr24, align 4
>>   %add.ptr24.sum = add nsw i64 %add.ptr16.sum82, %idxprom
>>   %arrayidx28 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr24.sum
>>   %20 = load i32, i32* %arrayidx28, align 4
>>   ...
>>   br i1 %exitcond, label %for.cond8.for.end_crit_edge, label %for.body10
>>
>> The IR before Codegen Prepare using the disabling gep patch:
>>
>> for.body10:
>>   ...
>>   %add.ptr = getelementptr inbounds i32, i32* %p1, i64 %idx.ext
>>   ...
>>   %add.ptr16 = getelementptr inbounds i32, i32* %add.ptr, i64 %mul14
>>   %15 = load i32, i32* %add.ptr16, align 4
>>   ...
>>   %arrayidx19 = getelementptr inbounds i32, i32* %add.ptr16, i64 %idxprom
>>   %17 = load i32, i32* %arrayidx19, align 4
>>   ...
>>   %add.ptr24 = getelementptr inbounds i32, i32* %add.ptr16, i64 %mul22
>>   %19 = load i32, i32* %add.ptr24, align 4
>>   %arrayidx28 = getelementptr inbounds i32, i32* %add.ptr24, i64 %idxprom
>>   %20 = load i32, i32* %arrayidx28, align 4
>>   ...
>>   br i1 %exitcond, label %for.cond8.for.end_crit_edge, label %for.body10
>>
>> gep merging generated several add insns which cannot be removed by
>> reassociategep patch. It is difficult for reassociation to remove
>> various  redundencies introduced by aggressive gep merging.
>>
>> I saw Mark already posted his patch for review:
>> http://reviews.llvm.org/D9136. I will update the patch in next mail to
>> fix some testcases.
>>
>> Thanks,
>> Wei.
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>