[PATCH] D37236: [InstCombine] add insertelement + shuffle demanded element fold
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 29 08:28:06 PDT 2017
spatel added a comment.
In https://reviews.llvm.org/D37236#854816, @craig.topper wrote:
> What about
>
> --- a/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
> +++ b/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
> @@ -996,19 +996,21 @@ Value *InstCombiner::SimplifyDemandedVectorElts(Value *V, APInt DemandedElts,
> // If this is inserting an element that isn't demanded, remove this
> // insertelement.
> unsigned IdxNo = Idx->getZExtValue();
> - if (IdxNo >= VWidth || !DemandedElts[IdxNo]) {
> - Worklist.Add(I);
> - return I->getOperand(0);
> - }
>
> // Otherwise, the element inserted overwrites whatever was there, so the
> // input demanded set is simpler than the output set.
> APInt DemandedElts2 = DemandedElts;
> - DemandedElts2.clearBit(IdxNo);
> + if (IdxNo < VWidth)
> + DemandedElts2.clearBit(IdxNo);
> TmpV = SimplifyDemandedVectorElts(I->getOperand(0), DemandedElts2,
> UndefElts, Depth + 1);
> if (TmpV) { I->setOperand(0, TmpV); MadeChange = true; }
>
> + if (IdxNo >= VWidth || !DemandedElts[IdxNo]) {
> + Worklist.Add(I);
> + return I->getOperand(0);
> + }
> +
> // The inserted element is defined.
> UndefElts.clearBit(IdxNo);
> break;
>
This patch will work on the examples given here, so maybe we want to do it anyway, but I think it's more fragile.
Ie, it works for a chain of ops that we know how to handle here in SimplifyDemandedVectorElts(), but as soon as something unexpected appears in the pattern, we might fail again. Here's the smallest variant that I could come up with to show the problem:
define <4 x float> @inselt_shuf_no_demand_bogus_insert_index_in_chain(float %a1, float %a2, float %a3, i32 %variable_index) {
%out1 = insertelement <4 x float> undef, float %a1, i32 1
%out12 = insertelement <4 x float> %out1, float %a2, i32 undef ; something unexpected
%out123 = insertelement <4 x float> %out12, float %a3, i32 3
%shuffle = shufflevector <4 x float> %out123, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
ret <4 x float> %shuffle
}
This will be left as:
%out1 = insertelement <4 x float> undef, float %a1, i32 1
ret <4 x float> %out1
Even if we make the other fix in visitShuffleVectorInst() to return instead of set MadeChange, this still won't get folded because %out1 has multiple uses.
https://reviews.llvm.org/D37236
More information about the llvm-commits
mailing list