[PATCH] Optimize redundant insertvalue instructions

Thu May 8 09:18:20 PDT 2014

Hi Michael,

thanks for fixing the tiny nitpicks :-)

Let me elaborate a little bit more on the one-use removal idea I mentioned off-list.
%0 = insertvalue { i8, i32 } undef, i8 0, 0
%1 = insertvalue { i8, i32 } %0, i8 1, 0
%2 = insertvalue { i8, i32 } %0, i8 2, 0

Currently you have to check every time that there is only one use when you traverse the def-use chain and
you need to make sure that it is the same type, because the use might not be the first operand.

The idea I had in mind was to traverse the use-def chain instead. This way you are guaranteed to have only
one use and don’t need all that extra checking. Furthermore when you hit instruction %0 coming from
instruction %1, you can optimize as usual by using instruction’s %0 first operand. Now when InstCombine
does the same for instruction %2, then instruction %0 won’t have any uses anymore and is trivially dead.
By adding %0 to the worklist again will automatically clean it up for you.

Cheers,
Juergen

On May 8, 2014, at 1:27 AM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:

> Hi,
> 
> Yesterday Juergen gave a couple or remarks, so here is a small add-on for the patch.
> 
> Besides some nits, I changed a check from (U->getType() != I.getType()) to (U->getOperand(0) != V). That check should guard us from propagating results in cases like the following (this case was also added to the test):
>  %0 = insertvalue { i16, i32 } undef, i16 %x, 0
>  %1 = insertvalue { i8, {i16, i32} } undef, { i16, i32 } %0, 1
> Here %0 has only one use and this use is also insertvalue, but that’s not the case we’re trying to catch. Though the original check also dealt with that, I think the new version is more strict and clear.
> 
> Is it ok for trunk?
> 
> <redundant-insertvalue-followup.patch>
> 
> Thanks,
> Michael
> 
> On May 7, 2014, at 5:26 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:
> 
>> 
>> On May 7, 2014, at 5:03 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> 
>>> ----- Original Message -----
>>>> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
>>>> To: "Hal J. Finkel" <hfinkel at anl.gov>
>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
>>>> Sent: Wednesday, May 7, 2014 7:07:35 AM
>>>> Subject: Re: [PATCH] Optimize redundant insertvalue instructions
>>>> 
>>>> 
>>>> 
>>>> Hi Hal,
>>>> 
>>>> Thank you for the review, I updated the patch accordingly to your
>>>> remarks.
>>>> 
>>>> There is a more general alternative to the cut-off: we could move
>>>> from uses to defs, maintaining a bitvector, that indicates which
>>>> fields were written. In this case the complexity would be O(N) and
>>>> we will also catch partially overlapping accesses as well. I did
>>>> something like this in my ‘sophisticated’ implementation.
>>>> 
>>>> However, I’d prefer the most simple implementation, so I think that
>>>> cut-off would be the best option here. I chose limit 10 - I don’t
>>>> expect long chains of insertvalue instructions anyway, and it is
>>>> enough for the original test.
>>> 
>>> Fair enough.
>>> 
>>>> 
>>>> Here is the new patch:
>>>> 
>>>> 
>>> 
>>> I don't actually see a call to visitInsertValueInst being inserted by the patch; I suppose you're putting it somewhere appropriate or else the test case would fail, right? Assuming that's the case, LGTM.
>>> 
>> Thanks.
>> There is no need to explicitly call visitInsertValueInst. We use InstVisitor interface (there is a call ‘visit’ in InstCombiner::DoOneIteration). The function ‘visit' is a big switch, constructed by macroses from IR/Instruction.def, and once we override visitInsertValueInst, it starts to be called. That’s how I understand this - if I’m mistaken I’d appreciate any corrections here. And yes, the test will fail otherwise.
>> 
>> Michael
>> 
>>> -Hal
>>> 
>>>> 
>>>> 
>>>> 
>>>> Thanks,
>>>> Michael
>>>> 
>>>> On May 7, 2014, at 3:01 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>> 
>>>>> Michael,
>>>>> 
>>>>> +/// Here the second instruction inserts values at the same
>>>>> indices, as the
>>>>> 
>>>>> +/// first one, making it redundant.
>>>>> 
>>>>> You mean, "making the first one redundant."
>>>>> 
>>>>> + Value *V = dyn_cast<Value>(&I);
>>>>> 
>>>>> There should be no cast necessary here (and it certainly shouldn't
>>>>> be a dyn_cast).
>>>>> 
>>>>> + V = dyn_cast<Value>(UserInsInst);
>>>>> 
>>>>> Same here.
>>>>> 
>>>>> + while (V->hasOneUse()) {
>>>>> 
>>>>> Should we have a cut-off here? If this function is called on every
>>>>> InsertValue instruction, then this algorithm will be O(N^2) in the
>>>>> size of the InsertValue chain.
>>>>> 
>>>>> -Hal
>>>>> 
>>>>> ----- Original Message -----
>>>>>> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
>>>>>> To: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
>>>>>> Sent: Tuesday, May 6, 2014 9:42:12 AM
>>>>>> Subject: [PATCH] Optimize redundant insertvalue instructions
>>>>>> 
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> 
>>>>>> This patch implements a simple optimization of redundant
>>>>>> insertvalue
>>>>>> instructions.
>>>>>> 
>>>>>> 
>>>>>> Here is a small example where it will help:
>>>>>> 
>>>>>> %0 = insertvalue { i8, i32 } undef, i8 %x, 0
>>>>>> %1 = insertvalue { i8, i32 } %0, i8 %y, 0
>>>>>> 
>>>>>> 
>>>>>> In this example, the second instruction inserts values at the same
>>>>>> indices, as the first one, and consequently, the first instruction
>>>>>> is redundant.
>>>>>> This code will be transformed to:
>>>>>> 
>>>>>> %0 = insertvalue { i8, i32 } undef, i8 %y, 0
>>>>>> 
>>>>>> 
>>>>>> This optimization catches nothing on specs and test-suite, but the
>>>>>> example was inspired by a real world application.
>>>>>> 
>>>>>> 
>>>>>> As an experiment, I tried a more sophisticated implementation,
>>>>>> which
>>>>>> was capable of dealing with several users and with partially
>>>>>> overlapping indices, but that also caught nothing on specs (and
>>>>>> for
>>>>>> my application the simple version was enough). If there is an
>>>>>> interest, I could share a patch with this more complicated
>>>>>> implementation as well.
>>>>>> 
>>>>>> 
>>>>>> Is it ok to commit this patch?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Michael
>>>>> 
>>>>> --
>>>>> Hal Finkel
>>>>> Assistant Computational Scientist
>>>>> Leadership Computing Facility
>>>>> Argonne National Laboratory
>>>> 
>>>> 
>>> 
>>> -- 
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>