[llvm] r289377 - [X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts.
Friedman, Eli via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 12 11:46:32 PST 2016
On 12/11/2016 12:54 AM, Craig Topper via llvm-commits wrote:
> Author: ctopper
> Date: Sun Dec 11 02:54:52 2016
> New Revision: 289377
>
> URL: http://llvm.org/viewvc/llvm-project?rev=289377&view=rev
> Log:
> [X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts.
>
> This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them.
>
> Modified:
> llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
> llvm/trunk/test/Transforms/InstCombine/x86-fma.ll
>
> Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp?rev=289377&r1=289376&r2=289377&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (original)
> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp Sun Dec 11 02:54:52 2016
> @@ -981,6 +981,7 @@ Value *InstCombiner::SimplifyDemandedVec
>
> bool MadeChange = false;
> APInt UndefElts2(VWidth, 0);
> + APInt UndefElts3(VWidth, 0);
> Value *TmpV;
> switch (I->getOpcode()) {
> default: break;
> @@ -1298,6 +1299,34 @@ Value *InstCombiner::SimplifyDemandedVec
> UndefElts &= UndefElts2;
> break;
>
> + case Intrinsic::x86_fma_vfmadd_ss:
> + case Intrinsic::x86_fma_vfmsub_ss:
> + case Intrinsic::x86_fma_vfnmadd_ss:
> + case Intrinsic::x86_fma_vfnmsub_ss:
> + case Intrinsic::x86_fma_vfmadd_sd:
> + case Intrinsic::x86_fma_vfmsub_sd:
> + case Intrinsic::x86_fma_vfnmadd_sd:
> + case Intrinsic::x86_fma_vfnmsub_sd:
> + TmpV = SimplifyDemandedVectorElts(II->getArgOperand(0), DemandedElts,
> + UndefElts, Depth + 1);
> + if (TmpV) { II->setArgOperand(0, TmpV); MadeChange = true; }
> + TmpV = SimplifyDemandedVectorElts(II->getArgOperand(1), DemandedElts,
> + UndefElts2, Depth + 1);
> + if (TmpV) { II->setArgOperand(1, TmpV); MadeChange = true; }
> + TmpV = SimplifyDemandedVectorElts(II->getArgOperand(2), DemandedElts,
> + UndefElts3, Depth + 1);
> + if (TmpV) { II->setArgOperand(2, TmpV); MadeChange = true; }
> +
> + // If lowest element of a scalar op isn't used then use Arg0.
> + if (DemandedElts.getLoBits(1) != 1)
> + return II->getArgOperand(0);
> +
> + // Output elements are undefined if all three are undefined. Consider
> + // things like undef&0. The result is known zero, not undef.
> + UndefElts &= UndefElts2;
> + UndefElts &= UndefElts3;
> + break;
> +
This looks really weird. For the second and third operands, shouldn't
we only be demanding the bottom element? In the same way, for those
operands we only care whether the bottom element is undef.
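
To make the question concrete, here is a minimal standalone sketch of the
lane bookkeeping being suggested. It is not the InstCombine code: plain
integer bitmasks stand in for APInt (bit i = lane i), and the names
ScalarFmaMasks and scalarFmaMasks are hypothetical. It assumes the scalar
FMA computes only lane 0 from all three operands and passes the upper
lanes of operand 0 through unchanged.

```cpp
#include <cstdint>

// Hypothetical model of the demanded/undef lane masks (bit i = lane i).
struct ScalarFmaMasks {
  std::uint32_t DemandedOp0;   // lanes demanded from operand 0
  std::uint32_t DemandedOp12;  // lanes demanded from operands 1 and 2
  std::uint32_t UndefResult;   // result lanes that may be treated as undef
};

// Assumption: result lane 0 = fma(a[0], b[0], c[0]); result lanes 1..N-1
// are copied from operand 0.
constexpr ScalarFmaMasks scalarFmaMasks(std::uint32_t Demanded,
                                        std::uint32_t Undef0,
                                        std::uint32_t Undef1,
                                        std::uint32_t Undef2) {
  return ScalarFmaMasks{
      // Operand 0 feeds every result lane, so it inherits the full mask.
      Demanded,
      // Operands 1 and 2 only feed lane 0, so at most bit 0 is demanded.
      Demanded & 1u,
      // Upper lanes follow operand 0 alone; lane 0 is conservatively undef
      // only when the bottom lane of all three operands is undef.
      (Undef0 & ~1u) | (Undef0 & Undef1 & Undef2 & 1u)};
}

// Four lanes, all demanded: operands 1 and 2 are only asked for lane 0.
static_assert(scalarFmaMasks(0xFu, 0x0u, 0x0u, 0x0u).DemandedOp12 == 0x1u,
              "only the bottom lane of operands 1 and 2 is demanded");
```

With this bookkeeping, undef upper lanes in the second and third operands
no longer affect the result's undef mask, and a caller that does not demand
lane 0 demands nothing at all from those two operands.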
-Eli
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project