[PATCH] Optimize redundant insertvalue instructions

Thu May 8 01:27:55 PDT 2014

Hi,

Yesterday Juergen gave a couple or remarks, so here is a small add-on for the patch.

Besides some nits, I changed a check from (U->getType() != I.getType()) to (U->getOperand(0) != V). That check should guard us from propagating results in cases like the following (this case was also added to the test):
  %0 = insertvalue { i16, i32 } undef, i16 %x, 0
  %1 = insertvalue { i8, {i16, i32} } undef, { i16, i32 } %0, 1
Here %0 has only one use and this use is also insertvalue, but that’s not the case we’re trying to catch. Though the original check also dealt with that, I think the new version is more strict and clear.

Is it ok for trunk?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: redundant-insertvalue-followup.patch
Type: application/octet-stream
Size: 1789 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140508/3362a76a/attachment.obj>
-------------- next part --------------

Thanks,
Michael

On May 7, 2014, at 5:26 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:

> 
> On May 7, 2014, at 5:03 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
>> ----- Original Message -----
>>> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
>>> To: "Hal J. Finkel" <hfinkel at anl.gov>
>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
>>> Sent: Wednesday, May 7, 2014 7:07:35 AM
>>> Subject: Re: [PATCH] Optimize redundant insertvalue instructions
>>> 
>>> 
>>> 
>>> Hi Hal,
>>> 
>>> Thank you for the review, I updated the patch accordingly to your
>>> remarks.
>>> 
>>> There is a more general alternative to the cut-off: we could move
>>> from uses to defs, maintaining a bitvector, that indicates which
>>> fields were written. In this case the complexity would be O(N) and
>>> we will also catch partially overlapping accesses as well. I did
>>> something like this in my ‘sophisticated’ implementation.
>>> 
>>> However, I’d prefer the most simple implementation, so I think that
>>> cut-off would be the best option here. I chose limit 10 - I don’t
>>> expect long chains of insertvalue instructions anyway, and it is
>>> enough for the original test.
>> 
>> Fair enough.
>> 
>>> 
>>> Here is the new patch:
>>> 
>>> 
>> 
>> I don't actually see a call to visitInsertValueInst being inserted by the patch; I suppose you're putting it somewhere appropriate or else the test case would fail, right? Assuming that's the case, LGTM.
>> 
> Thanks.
> There is no need to explicitly call visitInsertValueInst. We use InstVisitor interface (there is a call ‘visit’ in InstCombiner::DoOneIteration). The function ‘visit' is a big switch, constructed by macroses from IR/Instruction.def, and once we override visitInsertValueInst, it starts to be called. That’s how I understand this - if I’m mistaken I’d appreciate any corrections here. And yes, the test will fail otherwise.
> 
> Michael
> 
>> -Hal
>> 
>>> 
>>> 
>>> 
>>> Thanks,
>>> Michael
>>> 
>>> On May 7, 2014, at 3:01 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> 
>>>> Michael,
>>>> 
>>>> +/// Here the second instruction inserts values at the same
>>>> indices, as the
>>>> 
>>>> +/// first one, making it redundant.
>>>> 
>>>> You mean, "making the first one redundant."
>>>> 
>>>> + Value *V = dyn_cast<Value>(&I);
>>>> 
>>>> There should be no cast necessary here (and it certainly shouldn't
>>>> be a dyn_cast).
>>>> 
>>>> + V = dyn_cast<Value>(UserInsInst);
>>>> 
>>>> Same here.
>>>> 
>>>> + while (V->hasOneUse()) {
>>>> 
>>>> Should we have a cut-off here? If this function is called on every
>>>> InsertValue instruction, then this algorithm will be O(N^2) in the
>>>> size of the InsertValue chain.
>>>> 
>>>> -Hal
>>>> 
>>>> ----- Original Message -----
>>>>> From: "Michael Zolotukhin" <mzolotukhin at apple.com>
>>>>> To: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
>>>>> Sent: Tuesday, May 6, 2014 9:42:12 AM
>>>>> Subject: [PATCH] Optimize redundant insertvalue instructions
>>>>> 
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> 
>>>>> This patch implements a simple optimization of redundant
>>>>> insertvalue
>>>>> instructions.
>>>>> 
>>>>> 
>>>>> Here is a small example where it will help:
>>>>> 
>>>>> %0 = insertvalue { i8, i32 } undef, i8 %x, 0
>>>>> %1 = insertvalue { i8, i32 } %0, i8 %y, 0
>>>>> 
>>>>> 
>>>>> In this example, the second instruction inserts values at the same
>>>>> indices, as the first one, and consequently, the first instruction
>>>>> is redundant.
>>>>> This code will be transformed to:
>>>>> 
>>>>> %0 = insertvalue { i8, i32 } undef, i8 %y, 0
>>>>> 
>>>>> 
>>>>> This optimization catches nothing on specs and test-suite, but the
>>>>> example was inspired by a real world application.
>>>>> 
>>>>> 
>>>>> As an experiment, I tried a more sophisticated implementation,
>>>>> which
>>>>> was capable of dealing with several users and with partially
>>>>> overlapping indices, but that also caught nothing on specs (and
>>>>> for
>>>>> my application the simple version was enough). If there is an
>>>>> interest, I could share a patch with this more complicated
>>>>> implementation as well.
>>>>> 
>>>>> 
>>>>> Is it ok to commit this patch?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Michael
>>>> 
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>> 
>>> 
>> 
>> -- 
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits