[PATCH] D37236: [InstCombine] add insertelement + shuffle demanded element fold

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 29 08:28:06 PDT 2017


spatel added a comment.

In https://reviews.llvm.org/D37236#854816, @craig.topper wrote:

> What about
>
>   --- a/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
>   +++ b/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
>   @@ -996,19 +996,21 @@ Value *InstCombiner::SimplifyDemandedVectorElts(Value *V, APInt DemandedElts,
>        // If this is inserting an element that isn't demanded, remove this
>        // insertelement.
>        unsigned IdxNo = Idx->getZExtValue();
>   -    if (IdxNo >= VWidth || !DemandedElts[IdxNo]) {
>   -      Worklist.Add(I);
>   -      return I->getOperand(0);
>   -    }
>  
>        // Otherwise, the element inserted overwrites whatever was there, so the
>        // input demanded set is simpler than the output set.
>        APInt DemandedElts2 = DemandedElts;
>   -    DemandedElts2.clearBit(IdxNo);
>   +    if (IdxNo < VWidth)
>   +      DemandedElts2.clearBit(IdxNo);
>        TmpV = SimplifyDemandedVectorElts(I->getOperand(0), DemandedElts2,
>                                          UndefElts, Depth + 1);
>        if (TmpV) { I->setOperand(0, TmpV); MadeChange = true; }
>  
>   +    if (IdxNo >= VWidth || !DemandedElts[IdxNo]) {
>   +      Worklist.Add(I);
>   +      return I->getOperand(0);
>   +    }
>   +
>        // The inserted element is defined.
>        UndefElts.clearBit(IdxNo);
>        break;
>


This patch will work on the examples given here, so maybe we want to do it anyway, but I think it's more fragile.

Ie, it works for a chain of ops that we know how to handle here in SimplifyDemandedVectorElts(), but as soon as something unexpected appears in the pattern, we might fail again. Here's the smallest variant that I could come up with to show the problem:

  define <4 x float> @inselt_shuf_no_demand_bogus_insert_index_in_chain(float %a1, float %a2, float %a3, i32 %variable_index) {
    %out1 = insertelement <4 x float> undef, float %a1, i32 1
    %out12 = insertelement <4 x float> %out1, float %a2, i32 undef ; something unexpected
    %out123 = insertelement <4 x float> %out12, float %a3, i32 3
    %shuffle = shufflevector <4 x float> %out123, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
    ret <4 x float> %shuffle
  }

This will be left as:

  %out1 = insertelement <4 x float> undef, float %a1, i32 1
  ret <4 x float> %out1

Even if we make the other fix in visitShuffleVectorInst() to return instead of set MadeChange, this still won't get folded because %out1 has multiple uses.


https://reviews.llvm.org/D37236





More information about the llvm-commits mailing list