[PATCH] Fix a performance problem in gep(gep ...) merging

Wed Apr 15 12:52:36 PDT 2015

On Wed, Apr 15, 2015 at 12:16 PM, Xinliang David Li <xinliangli at gmail.com>
wrote:

> Mark, sounds great. Can you send the patch to Wei for testing ?
>

Patch is attached.  Not yet ready for review though ;-)

For the attached patch I added it to the x86 target.  Currently I don't
plan to do that when I check in the patch.  It'll just be added to those
targets which use the existing pass which this transformation is added to.
It can be extended to other targets later.  The patch breaks a few x86
tests maybe because the codegen pipeline is now different (also has CSE and
LICM) or maybe because there is a bug in my code.

Mark

>
> David
>
> On Wed, Apr 15, 2015 at 11:34 AM, Mark Heffernan <meheff at google.com>
> wrote:
>
>> There was a discussion on the mailing list about the exact issue this
>> patch is addressing.  We (Google team working on NVPTX backend) ran into
>> this same issue as did someone (Francois Pichet) working with the ppc
>> backend.  Here's the link to the discussion:
>>
>> http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/083398.html
>>
>> I initially proposed what your patch does.  That is, don't merge geps
>> unless the indices are constant (in which case the add is folded so no
>> additional instructions are potentially created in loops).  However, after
>> some discussion we came to the conclusion that the gep merging is a good
>> canonicalization which we want to do.  Instead a better approach might be
>> to split GEPs (undoing the early GEP merge) very late, right before
>> codegen, if it is determined to be profitable.  This way, all the
>> intermediate passes get the advantage of the canonicalization, while still
>> being able to do LICM/CSE on GEP subexpressions which the gep merge had
>> prevented.
>>
>> I have a patch I am in the last stages of testing which will perform this
>> optimization in the pass SeparateConstOffsetFromGEP (to be renamed
>> ReassociateGEPs... or maybe SplitGEPs now that I think about it).  I ran my
>> patch with your example from bug 23163 and it looks to perform better than
>> the prevent gep merging approach does.  With head compiler I get 29
>> instructions in the inner loop (this matches what you report in the bug).
>> You mention disabling merging reduces the loop instructions down to 25
>> instructions.  With the patch I am working on, it gets reduced down to 16
>> instructions.
>>
>> The pass is currently enabled only for select backends: nvptx, arm (-O3),
>> and ppc (-O3).   For testing with your bug I added it in x86 addIRPasses()
>> followed by CSE and LICM (these extra passes do nothing on your example
>> without the GEP splitting pass).
>>
>> I should have the patch out for review later in the week.
>>
>> Sorry for the duplicate effort here.  I should have filed a bug on this
>> issue before diving in :-(
>>
>> Mark
>>
>>
>> REPOSITORY
>>   rL LLVM
>>
>> http://reviews.llvm.org/D8911
>>
>> EMAIL PREFERENCES
>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150415/fffc0f6a/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reassociategeps.patch
Type: text/x-patch
Size: 145852 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150415/fffc0f6a/attachment.bin>