<div dir="ltr">That list of failures looks very strange. I don't know how my change could have affected such a varied list of tests. Especially the llvm-rc tests. That shouldn't have anything to do with intrinsics.<div><br></div><div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, Jul 12, 2018 at 6:30 PM Galina Kistanova via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello Craig,<br><br>This commit broke tests on one of our builders:<br><div><a href="http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/18419/steps/test/logs/stdio" target="_blank">http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/18419/steps/test/logs/stdio</a></div><div><br></div>. . 
.<br>Failing Tests (34):<br> Clang :: CodeGen/aarch64-inline-asm.c<br> Clang :: CodeGen/asm-inout.c<br> Clang :: CodeGen/avx512-kconstraints-att_inline_asm.c<br> Clang :: CodeGen/bittest-intrin.c<br> Clang :: CodeGen/mips-constraint-regs.c<br> Clang :: CodeGen/mips-constraints-mem.c<br> Clang :: CodeGen/ms-inline-asm.c<br> Clang :: CodeGenCXX/microsoft-uuidof.cpp<br> Clang :: CodeGenCXX/ms-inline-asm-return.cpp<br> Clang :: CodeGenObjC/arc-arm.m<br> Clang :: CodeGenObjC/category-class.m<br> LLVM :: Bitcode/upgrade-objcretainrelease-asm.ll<br> LLVM :: Bitcode/upgrade-objcretainrelease.ll<br> LLVM :: CodeGen/MIR/X86/external-symbol-operands.mir<br> LLVM :: CodeGen/X86/xray-attribute-instrumentation.ll<br> LLVM :: CodeGen/X86/xray-loop-detection.ll<br> LLVM :: CodeGen/X86/xray-tail-call-sled.ll<br> LLVM :: DebugInfo/PDB/dbi-bytes.test<br> LLVM :: DebugInfo/PDB/pdbdump-raw-bytes.test<br> LLVM :: DebugInfo/PDB/pdbdump-raw-stream.test<br> LLVM :: MC/AsmParser/directive_ascii.s<br> LLVM :: MC/ELF/debug-line2.s<br> LLVM :: MC/MachO/variable-exprs.s<br> LLVM :: Transforms/ObjCARC/contract-marker.ll<br> LLVM :: Transforms/ObjCARC/contract-testcases.ll<br> LLVM :: tools/llvm-rc/tag-accelerators.test<br> LLVM :: tools/llvm-rc/tag-dialog.test<br> LLVM :: tools/llvm-rc/tag-escape.test<br> LLVM :: tools/llvm-rc/tag-icon-cursor.test<br> LLVM :: tools/llvm-rc/tag-stringtable.test<br> LLVM :: tools/llvm-rc/tag-versioninfo.test<br> LLVM :: tools/llvm-readobj/sections-ext.test<br> lld :: COFF/combined-resources.test<br> lld :: COFF/resource.test<br><br>It is not a good idea to keep the bot red for too long. 
This hides new problems, which are later hard to track down.<br>Please have a look ASAP.<br><br>Thanks<br><br>Galina<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 11, 2018 at 5:29 PM, Craig Topper via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: ctopper<br>
Date: Wed Jul 11 17:29:56 2018<br>
New Revision: 336871<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=336871&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=336871&view=rev</a><br>
Log:<br>
[X86] Remove and autoupgrade the scalar fma intrinsics with masking.<br>
<br>
This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address them with follow-up patches.<br>
<br>
Modified:<br>
llvm/trunk/include/llvm/IR/IntrinsicsX86.td<br>
llvm/trunk/lib/IR/AutoUpgrade.cpp<br>
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br>
llvm/trunk/lib/Target/X86/X86InstrFMA.td<br>
llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h<br>
llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp<br>
llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp<br>
llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll<br>
llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll<br>
llvm/trunk/test/CodeGen/X86/avx512-scalar_mask.ll<br>
llvm/trunk/test/CodeGen/X86/fma-fneg-combine.ll<br>
llvm/trunk/test/Transforms/InstCombine/X86/x86-avx512.ll<br>
<br>
Modified: llvm/trunk/include/llvm/IR/IntrinsicsX86.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsX86.td?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsX86.td?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/IR/IntrinsicsX86.td (original)<br>
+++ llvm/trunk/include/llvm/IR/IntrinsicsX86.td Wed Jul 11 17:29:56 2018<br>
@@ -1933,57 +1933,6 @@ let TargetPrefix = "x86" in { // All in<br>
[llvm_float_ty, llvm_float_ty, llvm_float_ty, llvm_i32_ty],<br>
[IntrNoMem]>;<br>
<br>
-<br>
- def int_x86_avx512_mask_vfmadd_sd : // FIXME: Remove<br>
- Intrinsic<[llvm_v2f64_ty],<br>
- [llvm_v2f64_ty, llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask_vfmadd_ss : // FIXME: Remove<br>
- Intrinsic<[llvm_v4f32_ty],<br>
- [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_maskz_vfmadd_sd : // FIXME: Remove<br>
- Intrinsic<[llvm_v2f64_ty],<br>
- [llvm_v2f64_ty, llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_maskz_vfmadd_ss : // FIXME: Remove<br>
- Intrinsic<[llvm_v4f32_ty],<br>
- [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfmadd_sd : // FIXME: Remove<br>
- Intrinsic<[llvm_v2f64_ty],<br>
- [llvm_v2f64_ty, llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfmadd_ss : // FIXME: Remove<br>
- Intrinsic<[llvm_v4f32_ty],<br>
- [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfmsub_sd : // FIXME: Remove<br>
- Intrinsic<[llvm_v2f64_ty],<br>
- [llvm_v2f64_ty, llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfmsub_ss : // FIXME: Remove<br>
- Intrinsic<[llvm_v4f32_ty],<br>
- [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfnmsub_sd : // FIXME: Remove<br>
- Intrinsic<[llvm_v2f64_ty],<br>
- [llvm_v2f64_ty, llvm_v2f64_ty, llvm_v2f64_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
- def int_x86_avx512_mask3_vfnmsub_ss : // FIXME: Remove<br>
- Intrinsic<[llvm_v4f32_ty],<br>
- [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i8_ty,<br>
- llvm_i32_ty], [IntrNoMem]>;<br>
-<br>
def int_x86_avx512_vpmadd52h_uq_128 :<br>
GCCBuiltin<"__builtin_ia32_vpmadd52huq128">,<br>
Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty,<br>
<br>
Modified: llvm/trunk/lib/IR/AutoUpgrade.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AutoUpgrade.cpp?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/AutoUpgrade.cpp?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/IR/AutoUpgrade.cpp (original)<br>
+++ llvm/trunk/lib/IR/AutoUpgrade.cpp Wed Jul 11 17:29:56 2018<br>
@@ -81,17 +81,17 @@ static bool ShouldUpgradeX86Intrinsic(Fu<br>
Name.startswith("fma.vfmsubadd.") || // Added in 7.0<br>
Name.startswith("fma.vfnmadd.") || // Added in 7.0<br>
Name.startswith("fma.vfnmsub.") || // Added in 7.0<br>
- Name.startswith("avx512.mask.vfmadd.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask.vfnmadd.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask.vfnmsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask3.vfmadd.p") || // Added in 7.0<br>
- Name.startswith("avx512.maskz.vfmadd.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask3.vfmsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask3.vfnmsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask.vfmaddsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.maskz.vfmaddsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask3.vfmaddsub.p") || // Added in 7.0<br>
- Name.startswith("avx512.mask3.vfmsubadd.p") || // Added in 7.0<br>
+ Name.startswith("avx512.mask.vfmadd.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask.vfnmadd.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask.vfnmsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask3.vfmadd.") || // Added in 7.0<br>
+ Name.startswith("avx512.maskz.vfmadd.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask3.vfmsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask3.vfnmsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask.vfmaddsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.maskz.vfmaddsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask3.vfmaddsub.") || // Added in 7.0<br>
+ Name.startswith("avx512.mask3.vfmsubadd.") || // Added in 7.0<br>
Name.startswith("avx512.mask.shuf.i") || // Added in 6.0<br>
Name.startswith("avx512.mask.shuf.f") || // Added in 6.0<br>
Name.startswith("avx512.kunpck") || //added in 6.0 <br>
@@ -826,7 +826,7 @@ static Value *getX86MaskVec(IRBuilder<><br>
<br>
static Value *EmitX86Select(IRBuilder<> &Builder, Value *Mask,<br>
Value *Op0, Value *Op1) {<br>
- // If the mask is all ones just emit the align operation.<br>
+ // If the mask is all ones just emit the first operation.<br>
if (const auto *C = dyn_cast<Constant>(Mask))<br>
if (C->isAllOnesValue())<br>
return Op0;<br>
@@ -835,6 +835,21 @@ static Value *EmitX86Select(IRBuilder<><br>
return Builder.CreateSelect(Mask, Op0, Op1);<br>
}<br>
<br>
+static Value *EmitX86ScalarSelect(IRBuilder<> &Builder, Value *Mask,<br>
+ Value *Op0, Value *Op1) {<br>
+ // If the mask is all ones just emit the first operation.<br>
+ if (const auto *C = dyn_cast<Constant>(Mask))<br>
+ if (C->isAllOnesValue())<br>
+ return Op0;<br>
+<br>
+ llvm::VectorType *MaskTy =<br>
+ llvm::VectorType::get(Builder.getInt1Ty(),<br>
+ Mask->getType()->getIntegerBitWidth());<br>
+ Mask = Builder.CreateBitCast(Mask, MaskTy);<br>
+ Mask = Builder.CreateExtractElement(Mask, (uint64_t)0);<br>
+ return Builder.CreateSelect(Mask, Op0, Op1);<br>
+}<br>
+<br>
// Handle autoupgrade for masked PALIGNR and VALIGND/Q intrinsics.<br>
// PALIGNR handles large immediates by shifting while VALIGN masks the immediate<br>
// so we need to handle both cases. VALIGN also doesn't have 128-bit lanes.<br>
@@ -2806,6 +2821,64 @@ void llvm::UpgradeIntrinsicCall(CallInst<br>
<br>
Rep = Builder.CreateInsertElement(Constant::getNullValue(CI->getType()),<br>
Rep, (uint64_t)0);<br>
+ } else if (IsX86 && (Name.startswith("avx512.mask.vfmadd.s") ||<br>
+ Name.startswith("avx512.maskz.vfmadd.s") ||<br>
+ Name.startswith("avx512.mask3.vfmadd.s") ||<br>
+ Name.startswith("avx512.mask3.vfmsub.s") ||<br>
+ Name.startswith("avx512.mask3.vfnmsub.s"))) {<br>
+ bool IsMask3 = Name[11] == '3';<br>
+ bool IsMaskZ = Name[11] == 'z';<br>
+ // Drop the "avx512.mask." to make it easier.<br>
+ Name = Name.drop_front(IsMask3 || IsMaskZ ? 13 : 12);<br>
+ bool NegMul = Name[2] == 'n';<br>
+ bool NegAcc = NegMul ? Name[4] == 's' : Name[3] == 's';<br>
+<br>
+ Value *A = CI->getArgOperand(0);<br>
+ Value *B = CI->getArgOperand(1);<br>
+ Value *C = CI->getArgOperand(2);<br>
+<br>
+ if (NegMul && (IsMask3 || IsMaskZ))<br>
+ A = Builder.CreateFNeg(A);<br>
+ if (NegMul && !(IsMask3 || IsMaskZ))<br>
+ B = Builder.CreateFNeg(B);<br>
+ if (NegAcc)<br>
+ C = Builder.CreateFNeg(C);<br>
+<br>
+ A = Builder.CreateExtractElement(A, (uint64_t)0);<br>
+ B = Builder.CreateExtractElement(B, (uint64_t)0);<br>
+ C = Builder.CreateExtractElement(C, (uint64_t)0);<br>
+<br>
+ if (!isa<ConstantInt>(CI->getArgOperand(4)) ||<br>
+ cast<ConstantInt>(CI->getArgOperand(4))->getZExtValue() != 4) {<br>
+ Value *Ops[] = { A, B, C, CI->getArgOperand(4) };<br>
+<br>
+ Intrinsic::ID IID;<br>
+ if (Name.back() == 'd')<br>
+ IID = Intrinsic::x86_avx512_vfmadd_f64;<br>
+ else<br>
+ IID = Intrinsic::x86_avx512_vfmadd_f32;<br>
+ Function *FMA = Intrinsic::getDeclaration(CI->getModule(), IID);<br>
+ Rep = Builder.CreateCall(FMA, Ops);<br>
+ } else {<br>
+ Function *FMA = Intrinsic::getDeclaration(CI->getModule(),<br>
+ Intrinsic::fma,<br>
+ A->getType());<br>
+ Rep = Builder.CreateCall(FMA, { A, B, C });<br>
+ }<br>
+<br>
+ Value *PassThru = IsMaskZ ? Constant::getNullValue(Rep->getType()) :<br>
+ IsMask3 ? C : A;<br>
+<br>
+ // For Mask3 with NegAcc, we need to create a new extractelement that<br>
+ // avoids the negation above.<br>
+ if (NegAcc && IsMask3)<br>
+ PassThru = Builder.CreateExtractElement(CI->getArgOperand(2),<br>
+ (uint64_t)0);<br>
+<br>
+ Rep = EmitX86ScalarSelect(Builder, CI->getArgOperand(3),<br>
+ Rep, PassThru);<br>
+ Rep = Builder.CreateInsertElement(CI->getArgOperand(IsMask3 ? 2 : 0),<br>
+ Rep, (uint64_t)0);<br>
} else if (IsX86 && (Name.startswith("avx512.mask.vfmadd.p") ||<br>
Name.startswith("avx512.mask.vfnmadd.p") ||<br>
Name.startswith("avx512.mask.vfnmsub.p") ||<br>
@@ -2820,6 +2893,17 @@ void llvm::UpgradeIntrinsicCall(CallInst<br>
bool NegMul = Name[2] == 'n';<br>
bool NegAcc = NegMul ? Name[4] == 's' : Name[3] == 's';<br>
<br>
+ Value *A = CI->getArgOperand(0);<br>
+ Value *B = CI->getArgOperand(1);<br>
+ Value *C = CI->getArgOperand(2);<br>
+<br>
+ if (NegMul && (IsMask3 || IsMaskZ))<br>
+ A = Builder.CreateFNeg(A);<br>
+ if (NegMul && !(IsMask3 || IsMaskZ))<br>
+ B = Builder.CreateFNeg(B);<br>
+ if (NegAcc)<br>
+ C = Builder.CreateFNeg(C);<br>
+<br>
if (CI->getNumArgOperands() == 5 &&<br>
(!isa<ConstantInt>(CI->getArgOperand(4)) ||<br>
cast<ConstantInt>(CI->getArgOperand(4))->getZExtValue() != 4)) {<br>
@@ -2830,38 +2914,13 @@ void llvm::UpgradeIntrinsicCall(CallInst<br>
else<br>
IID = Intrinsic::x86_avx512_vfmadd_pd_512;<br>
<br>
- Value *Ops[] = { CI->getArgOperand(0), CI->getArgOperand(1),<br>
- CI->getArgOperand(2), CI->getArgOperand(4) };<br>
-<br>
- if (NegMul) {<br>
- if (IsMaskZ || IsMask3)<br>
- Ops[0] = Builder.CreateFNeg(Ops[0]);<br>
- else<br>
- Ops[1] = Builder.CreateFNeg(Ops[1]);<br>
- }<br>
- if (NegAcc)<br>
- Ops[2] = Builder.CreateFNeg(Ops[2]);<br>
-<br>
Rep = Builder.CreateCall(Intrinsic::getDeclaration(F->getParent(), IID),<br>
- Ops);<br>
+ { A, B, C, CI->getArgOperand(4) });<br>
} else {<br>
-<br>
- Value *Ops[] = { CI->getArgOperand(0), CI->getArgOperand(1),<br>
- CI->getArgOperand(2) };<br>
-<br>
- if (NegMul) {<br>
- if (IsMaskZ || IsMask3)<br>
- Ops[0] = Builder.CreateFNeg(Ops[0]);<br>
- else<br>
- Ops[1] = Builder.CreateFNeg(Ops[1]);<br>
- }<br>
- if (NegAcc)<br>
- Ops[2] = Builder.CreateFNeg(Ops[2]);<br>
-<br>
Function *FMA = Intrinsic::getDeclaration(CI->getModule(),<br>
Intrinsic::fma,<br>
- Ops[0]->getType());<br>
- Rep = Builder.CreateCall(FMA, Ops);<br>
+ A->getType());<br>
+ Rep = Builder.CreateCall(FMA, { A, B, C });<br>
}<br>
<br>
Value *PassThru = IsMaskZ ? llvm::Constant::getNullValue(CI->getType()) :<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Jul 11 17:29:56 2018<br>
@@ -20710,39 +20710,6 @@ SDValue X86TargetLowering::LowerINTRINSI<br>
Src1, Src2, Src3),<br>
Mask, PassThru, Subtarget, DAG);<br>
}<br>
- case FMA_OP_SCALAR_MASK:<br>
- case FMA_OP_SCALAR_MASK3:<br>
- case FMA_OP_SCALAR_MASKZ: {<br>
- SDValue Src1 = Op.getOperand(1);<br>
- SDValue Src2 = Op.getOperand(2);<br>
- SDValue Src3 = Op.getOperand(3);<br>
- SDValue Mask = Op.getOperand(4);<br>
- MVT VT = Op.getSimpleValueType();<br>
- SDValue PassThru = SDValue();<br>
-<br>
- // set PassThru element<br>
- if (IntrData->Type == FMA_OP_SCALAR_MASKZ)<br>
- PassThru = getZeroVector(VT, Subtarget, DAG, dl);<br>
- else if (IntrData->Type == FMA_OP_SCALAR_MASK3)<br>
- PassThru = Src3;<br>
- else<br>
- PassThru = Src1;<br>
-<br>
- unsigned IntrWithRoundingModeOpcode = IntrData->Opc1;<br>
- if (IntrWithRoundingModeOpcode != 0) {<br>
- SDValue Rnd = Op.getOperand(5);<br>
- if (!isRoundModeCurDirection(Rnd))<br>
- return getScalarMaskingNode(DAG.getNode(IntrWithRoundingModeOpcode, dl,<br>
- Op.getValueType(), Src1, Src2,<br>
- Src3, Rnd),<br>
- Mask, PassThru, Subtarget, DAG);<br>
- }<br>
-<br>
- return getScalarMaskingNode(DAG.getNode(IntrData->Opc0, dl,<br>
- Op.getValueType(), Src1, Src2,<br>
- Src3),<br>
- Mask, PassThru, Subtarget, DAG);<br>
- }<br>
case IFMA_OP:<br>
// NOTE: We need to swizzle the operands to pass the multiply operands<br>
// first.<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Wed Jul 11 17:29:56 2018<br>
@@ -6826,6 +6826,13 @@ multiclass avx512_scalar_fma_patterns<SD<br>
(COPY_TO_REGCLASS _.FRC:$src3, VR128X))>;<br>
<br>
def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
+ (Op _.FRC:$src2, _.FRC:$src3,<br>
+ (_.EltVT (extractelt (_.VT VR128X:$src1), (iPTR 0)))))))),<br>
+ (!cast<I>(Prefix#"231"#Suffix#"Zr_Int")<br>
+ VR128X:$src1, (COPY_TO_REGCLASS _.FRC:$src2, VR128X),<br>
+ (COPY_TO_REGCLASS _.FRC:$src3, VR128X))>;<br>
+<br>
+ def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
(Op _.FRC:$src2,<br>
(_.EltVT (extractelt (_.VT VR128X:$src1), (iPTR 0))),<br>
(_.ScalarLdFrag addr:$src3)))))),<br>
@@ -6841,6 +6848,13 @@ multiclass avx512_scalar_fma_patterns<SD<br>
addr:$src3)>;<br>
<br>
def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
+ (Op _.FRC:$src2, (_.ScalarLdFrag addr:$src3),<br>
+ (_.EltVT (extractelt (_.VT VR128X:$src1), (iPTR 0)))))))),<br>
+ (!cast<I>(Prefix#"231"#Suffix#"Zm_Int")<br>
+ VR128X:$src1, (COPY_TO_REGCLASS _.FRC:$src2, VR128X),<br>
+ addr:$src3)>;<br>
+<br>
+ def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
(X86selects VK1WM:$mask,<br>
(Op _.FRC:$src2,<br>
(_.EltVT (extractelt (_.VT VR128X:$src1), (iPTR 0))),<br>
@@ -6947,6 +6961,14 @@ multiclass avx512_scalar_fma_patterns<SD<br>
VR128X:$src1, (COPY_TO_REGCLASS _.FRC:$src2, VR128X),<br>
(COPY_TO_REGCLASS _.FRC:$src3, VR128X), imm:$rc)>;<br>
<br>
+ def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
+ (RndOp _.FRC:$src2, _.FRC:$src3,<br>
+ (_.EltVT (extractelt (_.VT VR128X:$src1), (iPTR 0))),<br>
+ (i32 imm:$rc)))))),<br>
+ (!cast<I>(Prefix#"231"#Suffix#"Zrb_Int")<br>
+ VR128X:$src1, (COPY_TO_REGCLASS _.FRC:$src2, VR128X),<br>
+ (COPY_TO_REGCLASS _.FRC:$src3, VR128X), imm:$rc)>;<br>
+<br>
def : Pat<(_.VT (Move (_.VT VR128X:$src1), (_.VT (scalar_to_vector<br>
(X86selects VK1WM:$mask,<br>
(RndOp _.FRC:$src2,<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrFMA.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrFMA.td (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrFMA.td Wed Jul 11 17:29:56 2018<br>
@@ -355,6 +355,13 @@ multiclass scalar_fma_patterns<SDNode Op<br>
(!cast<Instruction>(Prefix#"132"#Suffix#"m_Int")<br>
VR128:$src1, (COPY_TO_REGCLASS RC:$src2, VR128),<br>
addr:$src3)>;<br>
+<br>
+ def : Pat<(VT (Move (VT VR128:$src1), (VT (scalar_to_vector<br>
+ (Op RC:$src2, (mem_frag addr:$src3),<br>
+ (EltVT (extractelt (VT VR128:$src1), (iPTR 0)))))))),<br>
+ (!cast<Instruction>(Prefix#"231"#Suffix#"m_Int")<br>
+ VR128:$src1, (COPY_TO_REGCLASS RC:$src2, VR128),<br>
+ addr:$src3)>;<br>
}<br>
}<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h (original)<br>
+++ llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h Wed Jul 11 17:29:56 2018<br>
@@ -28,8 +28,7 @@ enum IntrinsicType : uint16_t {<br>
INTR_TYPE_1OP_MASK, INTR_TYPE_1OP_MASK_RM,<br>
INTR_TYPE_2OP_MASK, INTR_TYPE_2OP_MASK_RM,<br>
INTR_TYPE_3OP_MASK,<br>
- FMA_OP_MASK, FMA_OP_MASKZ,<br>
- FMA_OP_SCALAR_MASK, FMA_OP_SCALAR_MASKZ, FMA_OP_SCALAR_MASK3,<br>
+ FMA_OP_MASK, FMA_OP_MASKZ, FMA_OP_SCALAR,<br>
IFMA_OP, VPERM_2OP, INTR_TYPE_SCALAR_MASK,<br>
INTR_TYPE_SCALAR_MASK_RM, INTR_TYPE_3OP_SCALAR_MASK,<br>
COMPRESS_EXPAND_IN_REG,<br>
@@ -879,9 +878,6 @@ static const IntrinsicData IntrinsicsWi<br>
X86_INTRINSIC_DATA(avx512_mask_vcvtps2ph_512, INTR_TYPE_2OP_MASK,<br>
X86ISD::CVTPS2PH, 0),<br>
<br>
- X86_INTRINSIC_DATA(avx512_mask_vfmadd_sd, FMA_OP_SCALAR_MASK, X86ISD::FMADDS1, X86ISD::FMADDS1_RND),<br>
- X86_INTRINSIC_DATA(avx512_mask_vfmadd_ss, FMA_OP_SCALAR_MASK, X86ISD::FMADDS1, X86ISD::FMADDS1_RND),<br>
-<br>
X86_INTRINSIC_DATA(avx512_mask_vpshldv_d_128, FMA_OP_MASK, X86ISD::VSHLDV, 0),<br>
X86_INTRINSIC_DATA(avx512_mask_vpshldv_d_256, FMA_OP_MASK, X86ISD::VSHLDV, 0),<br>
X86_INTRINSIC_DATA(avx512_mask_vpshldv_d_512, FMA_OP_MASK, X86ISD::VSHLDV, 0),<br>
@@ -908,14 +904,6 @@ static const IntrinsicData IntrinsicsWi<br>
X86_INTRINSIC_DATA(avx512_mask_vpshufbitqmb_512, CMP_MASK,<br>
X86ISD::VPSHUFBITQMB, 0),<br>
<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfmadd_sd, FMA_OP_SCALAR_MASK3, X86ISD::FMADDS3, X86ISD::FMADDS3_RND),<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfmadd_ss, FMA_OP_SCALAR_MASK3, X86ISD::FMADDS3, X86ISD::FMADDS3_RND),<br>
-<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfmsub_sd, FMA_OP_SCALAR_MASK3, X86ISD::FMSUBS3, X86ISD::FMSUBS3_RND),<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfmsub_ss, FMA_OP_SCALAR_MASK3, X86ISD::FMSUBS3, X86ISD::FMSUBS3_RND),<br>
-<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfnmsub_sd, FMA_OP_SCALAR_MASK3, X86ISD::FNMSUBS3, X86ISD::FNMSUBS3_RND),<br>
- X86_INTRINSIC_DATA(avx512_mask3_vfnmsub_ss, FMA_OP_SCALAR_MASK3, X86ISD::FNMSUBS3, X86ISD::FNMSUBS3_RND),<br>
X86_INTRINSIC_DATA(avx512_maskz_fixupimm_pd_128, FIXUPIMM_MASKZ,<br>
X86ISD::VFIXUPIMM, 0),<br>
X86_INTRINSIC_DATA(avx512_maskz_fixupimm_pd_256, FIXUPIMM_MASKZ,<br>
@@ -933,9 +921,6 @@ static const IntrinsicData IntrinsicsWi<br>
X86_INTRINSIC_DATA(avx512_maskz_fixupimm_ss, FIXUPIMMS_MASKZ,<br>
X86ISD::VFIXUPIMMS, 0),<br>
<br>
- X86_INTRINSIC_DATA(avx512_maskz_vfmadd_sd, FMA_OP_SCALAR_MASKZ, X86ISD::FMADDS1, X86ISD::FMADDS1_RND),<br>
- X86_INTRINSIC_DATA(avx512_maskz_vfmadd_ss, FMA_OP_SCALAR_MASKZ, X86ISD::FMADDS1, X86ISD::FMADDS1_RND),<br>
-<br>
X86_INTRINSIC_DATA(avx512_maskz_vpshldv_d_128, FMA_OP_MASKZ, X86ISD::VSHLDV, 0),<br>
X86_INTRINSIC_DATA(avx512_maskz_vpshldv_d_256, FMA_OP_MASKZ, X86ISD::VSHLDV, 0),<br>
X86_INTRINSIC_DATA(avx512_maskz_vpshldv_d_512, FMA_OP_MASKZ, X86ISD::VSHLDV, 0),<br>
<br>
Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCalls.cpp Wed Jul 11 17:29:56 2018<br>
@@ -2535,16 +2535,6 @@ Instruction *InstCombiner::visitCallInst<br>
case Intrinsic::x86_avx512_mask_min_ss_round:<br>
case Intrinsic::x86_avx512_mask_max_sd_round:<br>
case Intrinsic::x86_avx512_mask_min_sd_round:<br>
- case Intrinsic::x86_avx512_mask_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_mask_vfmadd_sd:<br>
- case Intrinsic::x86_avx512_maskz_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_maskz_vfmadd_sd:<br>
- case Intrinsic::x86_avx512_mask3_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfmadd_sd:<br>
- case Intrinsic::x86_avx512_mask3_vfmsub_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfmsub_sd:<br>
- case Intrinsic::x86_avx512_mask3_vfnmsub_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfnmsub_sd:<br>
case Intrinsic::x86_sse_cmp_ss:<br>
case Intrinsic::x86_sse_min_ss:<br>
case Intrinsic::x86_sse_max_ss:<br>
<br>
Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (original)<br>
+++ llvm/trunk/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp Wed Jul 11 17:29:56 2018<br>
@@ -1497,10 +1497,6 @@ Value *InstCombiner::SimplifyDemandedVec<br>
case Intrinsic::x86_avx512_mask_sub_sd_round:<br>
case Intrinsic::x86_avx512_mask_max_sd_round:<br>
case Intrinsic::x86_avx512_mask_min_sd_round:<br>
- case Intrinsic::x86_avx512_mask_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_mask_vfmadd_sd:<br>
- case Intrinsic::x86_avx512_maskz_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_maskz_vfmadd_sd:<br>
TmpV = SimplifyDemandedVectorElts(II->getArgOperand(0), DemandedElts,<br>
UndefElts, Depth + 1);<br>
if (TmpV) { II->setArgOperand(0, TmpV); MadeChange = true; }<br>
@@ -1522,39 +1518,6 @@ Value *InstCombiner::SimplifyDemandedVec<br>
<br>
// Lower element is undefined if all three lower elements are undefined.<br>
// Consider things like undef&0. The result is known zero, not undef.<br>
- if (!UndefElts2[0] || !UndefElts3[0])<br>
- UndefElts.clearBit(0);<br>
-<br>
- break;<br>
-<br>
- case Intrinsic::x86_avx512_mask3_vfmadd_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfmadd_sd:<br>
- case Intrinsic::x86_avx512_mask3_vfmsub_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfmsub_sd:<br>
- case Intrinsic::x86_avx512_mask3_vfnmsub_ss:<br>
- case Intrinsic::x86_avx512_mask3_vfnmsub_sd:<br>
- // These intrinsics get the passthru bits from operand 2.<br>
- TmpV = SimplifyDemandedVectorElts(II->getArgOperand(2), DemandedElts,<br>
- UndefElts, Depth + 1);<br>
- if (TmpV) { II->setArgOperand(2, TmpV); MadeChange = true; }<br>
-<br>
- // If lowest element of a scalar op isn't used then use Arg2.<br>
- if (!DemandedElts[0]) {<br>
- Worklist.Add(II);<br>
- return II->getArgOperand(2);<br>
- }<br>
-<br>
- // Only lower element is used for operand 0 and 1.<br>
- DemandedElts = 1;<br>
- TmpV = SimplifyDemandedVectorElts(II->getArgOperand(0), DemandedElts,<br>
- UndefElts2, Depth + 1);<br>
- if (TmpV) { II->setArgOperand(0, TmpV); MadeChange = true; }<br>
- TmpV = SimplifyDemandedVectorElts(II->getArgOperand(1), DemandedElts,<br>
- UndefElts3, Depth + 1);<br>
- if (TmpV) { II->setArgOperand(1, TmpV); MadeChange = true; }<br>
-<br>
- // Lower element is undefined if all three lower elements are undefined.<br>
- // Consider things like undef&0. The result is known zero, not undef.<br>
if (!UndefElts2[0] || !UndefElts3[0])<br>
UndefElts.clearBit(0);<br>
<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/avx512-intrinsics-upgrade.ll Wed Jul 11 17:29:56 2018<br>
@@ -8784,3 +8784,670 @@ define <8 x i64>@test_int_x86_avx512_mas<br>
%res4 = add <8 x i64> %res3, %res2<br>
ret <8 x i64> %res4<br>
}<br>
+<br>
+declare <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+<br>
+define <2 x double>@test_int_x86_avx512_mask_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask_vfmadd_sd:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xd8]<br>
+; X86-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf1,0xa9,0xda]<br>
+; X86-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovapd %xmm0, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe0]<br>
+; X86-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0xa9,0xe2]<br>
+; X86-NEXT: ## xmm4 = (xmm1 * xmm4) + xmm2<br>
+; X86-NEXT: vmovapd %xmm0, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xe8]<br>
+; X86-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm5 ## encoding: [0x62,0xf2,0xf5,0x78,0xa9,0xea]<br>
+; X86-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x79,0xa9,0xc2]<br>
+; X86-NEXT: vaddpd %xmm4, %xmm3, %xmm1 ## encoding: [0xc5,0xe1,0x58,0xcc]<br>
+; X86-NEXT: vaddpd %xmm0, %xmm5, %xmm0 ## encoding: [0xc5,0xd1,0x58,0xc0]<br>
+; X86-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0x58,0xc0]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask_vfmadd_sd:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xd8]<br>
+; X64-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf1,0xa9,0xda]<br>
+; X64-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovapd %xmm0, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe0]<br>
+; X64-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xf5,0x09,0xa9,0xe2]<br>
+; X64-NEXT: ## xmm4 = (xmm1 * xmm4) + xmm2<br>
+; X64-NEXT: vmovapd %xmm0, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xe8]<br>
+; X64-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm5 ## encoding: [0x62,0xf2,0xf5,0x78,0xa9,0xea]<br>
+; X64-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0xf5,0x79,0xa9,0xc2]<br>
+; X64-NEXT: vaddpd %xmm4, %xmm3, %xmm1 ## encoding: [0xc5,0xe1,0x58,0xcc]<br>
+; X64-NEXT: vaddpd %xmm0, %xmm5, %xmm0 ## encoding: [0xc5,0xd1,0x58,0xc0]<br>
+; X64-NEXT: vaddpd %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf1,0x58,0xc0]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
+ %res1 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
+ %res3 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <2 x double> %res, %res1<br>
+ %res5 = fadd <2 x double> %res2, %res3<br>
+ %res6 = fadd <2 x double> %res4, %res5<br>
+ ret <2 x double> %res6<br>
+}<br>
+<br>
+declare <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask_vfmadd_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask_vfmadd_ss:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xd8]<br>
+; X86-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0xa9,0xda]<br>
+; X86-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovaps %xmm0, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe0]<br>
+; X86-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0xa9,0xe2]<br>
+; X86-NEXT: ## xmm4 = (xmm1 * xmm4) + xmm2<br>
+; X86-NEXT: vmovaps %xmm0, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xe8]<br>
+; X86-NEXT: vfmadd213ss {rz-sae}, %xmm2, %xmm1, %xmm5 ## encoding: [0x62,0xf2,0x75,0x78,0xa9,0xea]<br>
+; X86-NEXT: vfmadd213ss {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x75,0x79,0xa9,0xc2]<br>
+; X86-NEXT: vaddps %xmm4, %xmm3, %xmm1 ## encoding: [0xc5,0xe0,0x58,0xcc]<br>
+; X86-NEXT: vaddps %xmm0, %xmm5, %xmm0 ## encoding: [0xc5,0xd0,0x58,0xc0]<br>
+; X86-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf0,0x58,0xc0]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask_vfmadd_ss:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovaps %xmm0, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xd8]<br>
+; X64-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x71,0xa9,0xda]<br>
+; X64-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovaps %xmm0, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe0]<br>
+; X64-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0xa9,0xe2]<br>
+; X64-NEXT: ## xmm4 = (xmm1 * xmm4) + xmm2<br>
+; X64-NEXT: vmovaps %xmm0, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xe8]<br>
+; X64-NEXT: vfmadd213ss {rz-sae}, %xmm2, %xmm1, %xmm5 ## encoding: [0x62,0xf2,0x75,0x78,0xa9,0xea]<br>
+; X64-NEXT: vfmadd213ss {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x75,0x79,0xa9,0xc2]<br>
+; X64-NEXT: vaddps %xmm4, %xmm3, %xmm1 ## encoding: [0xc5,0xe0,0x58,0xcc]<br>
+; X64-NEXT: vaddps %xmm0, %xmm5, %xmm0 ## encoding: [0xc5,0xd0,0x58,0xc0]<br>
+; X64-NEXT: vaddps %xmm0, %xmm1, %xmm0 ## encoding: [0xc5,0xf0,0x58,0xc0]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 4)<br>
+ %res1 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 3)<br>
+ %res3 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <4 x float> %res, %res1<br>
+ %res5 = fadd <4 x float> %res2, %res3<br>
+ %res6 = fadd <4 x float> %res4, %res5<br>
+ ret <4 x float> %res6<br>
+}<br>
+<br>
+declare <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+<br>
+define <2 x double>@test_int_x86_avx512_maskz_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_maskz_vfmadd_sd:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xd8]<br>
+; X86-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0x89,0xa9,0xda]<br>
+; X86-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X86-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xf9,0xa9,0xc2]<br>
+; X86-NEXT: vaddpd %xmm0, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc0]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_maskz_vfmadd_sd:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovapd %xmm0, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xd8]<br>
+; X64-NEXT: vfmadd213sd %xmm2, %xmm1, %xmm3 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0x89,0xa9,0xda]<br>
+; X64-NEXT: ## xmm3 = (xmm1 * xmm3) + xmm2<br>
+; X64-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0xf5,0xf9,0xa9,0xc2]<br>
+; X64-NEXT: vaddpd %xmm0, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc0]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
+ %res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
+ %res2 = fadd <2 x double> %res, %res1<br>
+ ret <2 x double> %res2<br>
+}<br>
+<br>
+declare <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+<br>
+define <4 x float>@test_int_x86_avx512_maskz_vfmadd_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_maskz_vfmadd_ss:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0x89,0xa9,0xc2]<br>
+; X86-NEXT: ## xmm0 = (xmm1 * xmm0) + xmm2<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_maskz_vfmadd_ss:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vfmadd213ss %xmm2, %xmm1, %xmm0 {%k1} {z} ## encoding: [0x62,0xf2,0x75,0x89,0xa9,0xc2]<br>
+; X64-NEXT: ## xmm0 = (xmm1 * xmm0) + xmm2<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
+ %res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
+ %res2 = fadd <4 x float> %res, %res1<br>
+ ret <4 x float> %res<br>
+}<br>
+declare <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+<br>
+define <2 x double>@test_int_x86_avx512_mask3_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfmadd_sd:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X86-NEXT: vfmadd231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xb9,0xd9]<br>
+; X86-NEXT: ## xmm3 = (xmm0 * xmm1) + xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X86-NEXT: vfmadd231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb9,0xe1]<br>
+; X86-NEXT: ## xmm4 = (xmm0 * xmm1) + xmm4<br>
+; X86-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X86-NEXT: vfmadd231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xb9,0xe9]<br>
+; X86-NEXT: vfmadd231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xb9,0xd1]<br>
+; X86-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X86-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X86-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfmadd_sd:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X64-NEXT: vfmadd231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xb9,0xd9]<br>
+; X64-NEXT: ## xmm3 = (xmm0 * xmm1) + xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X64-NEXT: vfmadd231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xb9,0xe1]<br>
+; X64-NEXT: ## xmm4 = (xmm0 * xmm1) + xmm4<br>
+; X64-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X64-NEXT: vfmadd231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xb9,0xe9]<br>
+; X64-NEXT: vfmadd231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xb9,0xd1]<br>
+; X64-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X64-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X64-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
+ %res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
+ %res3 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <2 x double> %res, %res1<br>
+ %res5 = fadd <2 x double> %res2, %res3<br>
+ %res6 = fadd <2 x double> %res4, %res5<br>
+ ret <2 x double> %res6<br>
+}<br>
+<br>
+declare <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask3_vfmadd_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfmadd_ss:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X86-NEXT: vfmadd231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xb9,0xd9]<br>
+; X86-NEXT: ## xmm3 = (xmm0 * xmm1) + xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X86-NEXT: vfmadd231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb9,0xe1]<br>
+; X86-NEXT: ## xmm4 = (xmm0 * xmm1) + xmm4<br>
+; X86-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X86-NEXT: vfmadd231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xb9,0xe9]<br>
+; X86-NEXT: vfmadd231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xb9,0xd1]<br>
+; X86-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X86-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X86-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfmadd_ss:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X64-NEXT: vfmadd231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xb9,0xd9]<br>
+; X64-NEXT: ## xmm3 = (xmm0 * xmm1) + xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X64-NEXT: vfmadd231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb9,0xe1]<br>
+; X64-NEXT: ## xmm4 = (xmm0 * xmm1) + xmm4<br>
+; X64-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X64-NEXT: vfmadd231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xb9,0xe9]<br>
+; X64-NEXT: vfmadd231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xb9,0xd1]<br>
+; X64-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X64-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X64-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 4)<br>
+ %res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 3)<br>
+ %res3 = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <4 x float> %res, %res1<br>
+ %res5 = fadd <4 x float> %res2, %res3<br>
+ %res6 = fadd <4 x float> %res4, %res5<br>
+ ret <4 x float> %res6<br>
+}<br>
+<br>
+define void @fmadd_ss_mask_memfold(float* %a, float* %b, i8 %c) {<br>
+; X86-LABEL: fmadd_ss_mask_memfold:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x0c]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ## encoding: [0x8b,0x54,0x24,0x04]<br>
+; X86-NEXT: vmovss (%edx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x02]<br>
+; X86-NEXT: ## xmm0 = mem[0],zero,zero,zero<br>
+; X86-NEXT: vmovss (%ecx), %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x09]<br>
+; X86-NEXT: ## xmm1 = mem[0],zero,zero,zero<br>
+; X86-NEXT: vfmadd213ss %xmm0, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa9,0xc8]<br>
+; X86-NEXT: ## xmm1 = (xmm0 * xmm1) + xmm0<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovss %xmm1, %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x10,0xc1]<br>
+; X86-NEXT: vmovss %xmm0, (%edx) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x11,0x02]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: fmadd_ss_mask_memfold:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]<br>
+; X64-NEXT: ## xmm0 = mem[0],zero,zero,zero<br>
+; X64-NEXT: vmovss (%rsi), %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x0e]<br>
+; X64-NEXT: ## xmm1 = mem[0],zero,zero,zero<br>
+; X64-NEXT: vfmadd213ss %xmm0, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xa9,0xc8]<br>
+; X64-NEXT: ## xmm1 = (xmm0 * xmm1) + xmm0<br>
+; X64-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]<br>
+; X64-NEXT: vmovss %xmm1, %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x10,0xc1]<br>
+; X64-NEXT: vmovss %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x11,0x07]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %a.val = load float, float* %a<br>
+ %av0 = insertelement <4 x float> undef, float %a.val, i32 0<br>
+ %av1 = insertelement <4 x float> %av0, float 0.000000e+00, i32 1<br>
+ %av2 = insertelement <4 x float> %av1, float 0.000000e+00, i32 2<br>
+ %av = insertelement <4 x float> %av2, float 0.000000e+00, i32 3<br>
+<br>
+ %b.val = load float, float* %b<br>
+ %bv0 = insertelement <4 x float> undef, float %b.val, i32 0<br>
+ %bv1 = insertelement <4 x float> %bv0, float 0.000000e+00, i32 1<br>
+ %bv2 = insertelement <4 x float> %bv1, float 0.000000e+00, i32 2<br>
+ %bv = insertelement <4 x float> %bv2, float 0.000000e+00, i32 3<br>
+<br>
+ %vr = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %av, <4 x float> %bv, <4 x float> %av, i8 %c, i32 4)<br>
+<br>
+ %sr = extractelement <4 x float> %vr, i32 0<br>
+ store float %sr, float* %a<br>
+ ret void<br>
+}<br>
+<br>
+define void @fmadd_ss_maskz_memfold(float* %a, float* %b, i8 %c) {<br>
+; X86-LABEL: fmadd_ss_maskz_memfold:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x0c]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ## encoding: [0x8b,0x54,0x24,0x04]<br>
+; X86-NEXT: vmovss (%edx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x02]<br>
+; X86-NEXT: ## xmm0 = mem[0],zero,zero,zero<br>
+; X86-NEXT: vfmadd231ss (%ecx), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xb9,0x01]<br>
+; X86-NEXT: ## xmm0 = (xmm0 * mem) + xmm0<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vxorps %xmm1, %xmm1, %xmm1 ## encoding: [0xc5,0xf0,0x57,0xc9]<br>
+; X86-NEXT: vmovss %xmm0, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x10,0xc8]<br>
+; X86-NEXT: vmovss %xmm1, (%edx) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x11,0x0a]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: fmadd_ss_maskz_memfold:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovss (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x10,0x07]<br>
+; X64-NEXT: ## xmm0 = mem[0],zero,zero,zero<br>
+; X64-NEXT: vfmadd231ss (%rsi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xb9,0x06]<br>
+; X64-NEXT: ## xmm0 = (xmm0 * mem) + xmm0<br>
+; X64-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]<br>
+; X64-NEXT: vxorps %xmm1, %xmm1, %xmm1 ## encoding: [0xc5,0xf0,0x57,0xc9]<br>
+; X64-NEXT: vmovss %xmm0, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0x7e,0x09,0x10,0xc8]<br>
+; X64-NEXT: vmovss %xmm1, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfa,0x11,0x0f]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %a.val = load float, float* %a<br>
+ %av0 = insertelement <4 x float> undef, float %a.val, i32 0<br>
+ %av1 = insertelement <4 x float> %av0, float 0.000000e+00, i32 1<br>
+ %av2 = insertelement <4 x float> %av1, float 0.000000e+00, i32 2<br>
+ %av = insertelement <4 x float> %av2, float 0.000000e+00, i32 3<br>
+<br>
+ %b.val = load float, float* %b<br>
+ %bv0 = insertelement <4 x float> undef, float %b.val, i32 0<br>
+ %bv1 = insertelement <4 x float> %bv0, float 0.000000e+00, i32 1<br>
+ %bv2 = insertelement <4 x float> %bv1, float 0.000000e+00, i32 2<br>
+ %bv = insertelement <4 x float> %bv2, float 0.000000e+00, i32 3<br>
+<br>
+ %vr = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %av, <4 x float> %bv, <4 x float> %av, i8 %c, i32 4)<br>
+<br>
+ %sr = extractelement <4 x float> %vr, i32 0<br>
+ store float %sr, float* %a<br>
+ ret void<br>
+}<br>
+<br>
+define void @fmadd_sd_mask_memfold(double* %a, double* %b, i8 %c) {<br>
+; X86-LABEL: fmadd_sd_mask_memfold:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x0c]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ## encoding: [0x8b,0x54,0x24,0x04]<br>
+; X86-NEXT: vmovsd (%edx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x02]<br>
+; X86-NEXT: ## xmm0 = mem[0],zero<br>
+; X86-NEXT: vmovsd (%ecx), %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x09]<br>
+; X86-NEXT: ## xmm1 = mem[0],zero<br>
+; X86-NEXT: vfmadd213sd %xmm0, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa9,0xc8]<br>
+; X86-NEXT: ## xmm1 = (xmm0 * xmm1) + xmm0<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovsd %xmm1, %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x10,0xc1]<br>
+; X86-NEXT: vmovsd %xmm0, (%edx) ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x11,0x02]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: fmadd_sd_mask_memfold:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovsd (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x07]<br>
+; X64-NEXT: ## xmm0 = mem[0],zero<br>
+; X64-NEXT: vmovsd (%rsi), %xmm1 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x0e]<br>
+; X64-NEXT: ## xmm1 = mem[0],zero<br>
+; X64-NEXT: vfmadd213sd %xmm0, %xmm0, %xmm1 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xa9,0xc8]<br>
+; X64-NEXT: ## xmm1 = (xmm0 * xmm1) + xmm0<br>
+; X64-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]<br>
+; X64-NEXT: vmovsd %xmm1, %xmm0, %xmm0 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x10,0xc1]<br>
+; X64-NEXT: vmovsd %xmm0, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x11,0x07]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %a.val = load double, double* %a<br>
+ %av0 = insertelement <2 x double> undef, double %a.val, i32 0<br>
+ %av = insertelement <2 x double> %av0, double 0.000000e+00, i32 1<br>
+<br>
+ %b.val = load double, double* %b<br>
+ %bv0 = insertelement <2 x double> undef, double %b.val, i32 0<br>
+ %bv = insertelement <2 x double> %bv0, double 0.000000e+00, i32 1<br>
+<br>
+ %vr = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %av, <2 x double> %bv, <2 x double> %av, i8 %c, i32 4)<br>
+<br>
+ %sr = extractelement <2 x double> %vr, i32 0<br>
+ store double %sr, double* %a<br>
+ ret void<br>
+}<br>
+<br>
+define void @fmadd_sd_maskz_memfold(double* %a, double* %b, i8 %c) {<br>
+; X86-LABEL: fmadd_sd_maskz_memfold:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x0c]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx ## encoding: [0x8b,0x4c,0x24,0x08]<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %edx ## encoding: [0x8b,0x54,0x24,0x04]<br>
+; X86-NEXT: vmovsd (%edx), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x02]<br>
+; X86-NEXT: ## xmm0 = mem[0],zero<br>
+; X86-NEXT: vfmadd231sd (%ecx), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xb9,0x01]<br>
+; X86-NEXT: ## xmm0 = (xmm0 * mem) + xmm0<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vxorpd %xmm1, %xmm1, %xmm1 ## encoding: [0xc5,0xf1,0x57,0xc9]<br>
+; X86-NEXT: vmovsd %xmm0, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x10,0xc8]<br>
+; X86-NEXT: vmovsd %xmm1, (%edx) ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x11,0x0a]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: fmadd_sd_maskz_memfold:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovsd (%rdi), %xmm0 ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x10,0x07]<br>
+; X64-NEXT: ## xmm0 = mem[0],zero<br>
+; X64-NEXT: vfmadd231sd (%rsi), %xmm0, %xmm0 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xb9,0x06]<br>
+; X64-NEXT: ## xmm0 = (xmm0 * mem) + xmm0<br>
+; X64-NEXT: kmovw %edx, %k1 ## encoding: [0xc5,0xf8,0x92,0xca]<br>
+; X64-NEXT: vxorpd %xmm1, %xmm1, %xmm1 ## encoding: [0xc5,0xf1,0x57,0xc9]<br>
+; X64-NEXT: vmovsd %xmm0, %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf1,0xff,0x09,0x10,0xc8]<br>
+; X64-NEXT: vmovsd %xmm1, (%rdi) ## EVEX TO VEX Compression encoding: [0xc5,0xfb,0x11,0x0f]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %a.val = load double, double* %a<br>
+ %av0 = insertelement <2 x double> undef, double %a.val, i32 0<br>
+ %av = insertelement <2 x double> %av0, double 0.000000e+00, i32 1<br>
+<br>
+ %b.val = load double, double* %b<br>
+ %bv0 = insertelement <2 x double> undef, double %b.val, i32 0<br>
+ %bv = insertelement <2 x double> %bv0, double 0.000000e+00, i32 1<br>
+<br>
+ %vr = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double> %av, <2 x double> %bv, <2 x double> %av, i8 %c, i32 4)<br>
+<br>
+ %sr = extractelement <2 x double> %vr, i32 0<br>
+ store double %sr, double* %a<br>
+ ret void<br>
+}<br>
+<br>
+declare <2 x double> @llvm.x86.avx512.mask3.vfmsub.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+<br>
+define <2 x double>@test_int_x86_avx512_mask3_vfmsub_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfmsub_sd:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X86-NEXT: vfmsub231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xbb,0xd9]<br>
+; X86-NEXT: ## xmm3 = (xmm0 * xmm1) - xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X86-NEXT: vfmsub231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbb,0xe1]<br>
+; X86-NEXT: ## xmm4 = (xmm0 * xmm1) - xmm4<br>
+; X86-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X86-NEXT: vfmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xbb,0xe9]<br>
+; X86-NEXT: vfmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xbb,0xd1]<br>
+; X86-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X86-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X86-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfmsub_sd:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X64-NEXT: vfmsub231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xbb,0xd9]<br>
+; X64-NEXT: ## xmm3 = (xmm0 * xmm1) - xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X64-NEXT: vfmsub231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbb,0xe1]<br>
+; X64-NEXT: ## xmm4 = (xmm0 * xmm1) - xmm4<br>
+; X64-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X64-NEXT: vfmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xbb,0xe9]<br>
+; X64-NEXT: vfmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xbb,0xd1]<br>
+; X64-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X64-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X64-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
+ %res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
+ %res3 = call <2 x double> @llvm.x86.avx512.mask3.vfmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <2 x double> %res, %res1<br>
+ %res5 = fadd <2 x double> %res2, %res3<br>
+ %res6 = fadd <2 x double> %res4, %res5<br>
+ ret <2 x double> %res6<br>
+}<br>
+<br>
+declare <4 x float> @llvm.x86.avx512.mask3.vfmsub.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask3_vfmsub_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfmsub_ss:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X86-NEXT: vfmsub231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xbb,0xd9]<br>
+; X86-NEXT: ## xmm3 = (xmm0 * xmm1) - xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X86-NEXT: vfmsub231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbb,0xe1]<br>
+; X86-NEXT: ## xmm4 = (xmm0 * xmm1) - xmm4<br>
+; X86-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X86-NEXT: vfmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xbb,0xe9]<br>
+; X86-NEXT: vfmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xbb,0xd1]<br>
+; X86-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X86-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X86-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfmsub_ss:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X64-NEXT: vfmsub231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xbb,0xd9]<br>
+; X64-NEXT: ## xmm3 = (xmm0 * xmm1) - xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X64-NEXT: vfmsub231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbb,0xe1]<br>
+; X64-NEXT: ## xmm4 = (xmm0 * xmm1) - xmm4<br>
+; X64-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X64-NEXT: vfmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xbb,0xe9]<br>
+; X64-NEXT: vfmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xbb,0xd1]<br>
+; X64-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X64-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X64-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 4)<br>
+ %res1 = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 3)<br>
+ %res3 = call <4 x float> @llvm.x86.avx512.mask3.vfmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <4 x float> %res, %res1<br>
+ %res5 = fadd <4 x float> %res2, %res3<br>
+ %res6 = fadd <4 x float> %res4, %res5<br>
+ ret <4 x float> %res6<br>
+}<br>
+<br>
+declare <2 x double> @llvm.x86.avx512.mask3.vfnmsub.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+<br>
+define <2 x double>@test_int_x86_avx512_mask3_vfnmsub_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfnmsub_sd:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X86-NEXT: vfnmsub231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xbf,0xd9]<br>
+; X86-NEXT: ## xmm3 = -(xmm0 * xmm1) - xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X86-NEXT: vfnmsub231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbf,0xe1]<br>
+; X86-NEXT: ## xmm4 = -(xmm0 * xmm1) - xmm4<br>
+; X86-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X86-NEXT: vfnmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xbf,0xe9]<br>
+; X86-NEXT: vfnmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xbf,0xd1]<br>
+; X86-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X86-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X86-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfnmsub_sd:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovapd %xmm2, %xmm3 ## encoding: [0xc5,0xf9,0x28,0xda]<br>
+; X64-NEXT: vfnmsub231sd %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0xf9,0xbf,0xd9]<br>
+; X64-NEXT: ## xmm3 = -(xmm0 * xmm1) - xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovapd %xmm2, %xmm4 ## encoding: [0xc5,0xf9,0x28,0xe2]<br>
+; X64-NEXT: vfnmsub231sd %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0xfd,0x09,0xbf,0xe1]<br>
+; X64-NEXT: ## xmm4 = -(xmm0 * xmm1) - xmm4<br>
+; X64-NEXT: vmovapd %xmm2, %xmm5 ## encoding: [0xc5,0xf9,0x28,0xea]<br>
+; X64-NEXT: vfnmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0xfd,0x78,0xbf,0xe9]<br>
+; X64-NEXT: vfnmsub231sd {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0xfd,0x79,0xbf,0xd1]<br>
+; X64-NEXT: vaddpd %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe1,0x58,0xc4]<br>
+; X64-NEXT: vaddpd %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd1,0x58,0xca]<br>
+; X64-NEXT: vaddpd %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf9,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
+ %res1 = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
+ %res3 = call <2 x double> @llvm.x86.avx512.mask3.vfnmsub.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <2 x double> %res, %res1<br>
+ %res5 = fadd <2 x double> %res2, %res3<br>
+ %res6 = fadd <2 x double> %res4, %res5<br>
+ ret <2 x double> %res6<br>
+}<br>
+<br>
+declare <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask3_vfnmsub_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfnmsub_ss:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %al ## encoding: [0x8a,0x44,0x24,0x04]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X86-NEXT: vfnmsub231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xbf,0xd9]<br>
+; X86-NEXT: ## xmm3 = -(xmm0 * xmm1) - xmm3<br>
+; X86-NEXT: kmovw %eax, %k1 ## encoding: [0xc5,0xf8,0x92,0xc8]<br>
+; X86-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X86-NEXT: vfnmsub231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbf,0xe1]<br>
+; X86-NEXT: ## xmm4 = -(xmm0 * xmm1) - xmm4<br>
+; X86-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X86-NEXT: vfnmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xbf,0xe9]<br>
+; X86-NEXT: vfnmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xbf,0xd1]<br>
+; X86-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X86-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X86-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfnmsub_ss:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: vmovaps %xmm2, %xmm3 ## encoding: [0xc5,0xf8,0x28,0xda]<br>
+; X64-NEXT: vfnmsub231ss %xmm1, %xmm0, %xmm3 ## EVEX TO VEX Compression encoding: [0xc4,0xe2,0x79,0xbf,0xd9]<br>
+; X64-NEXT: ## xmm3 = -(xmm0 * xmm1) - xmm3<br>
+; X64-NEXT: kmovw %edi, %k1 ## encoding: [0xc5,0xf8,0x92,0xcf]<br>
+; X64-NEXT: vmovaps %xmm2, %xmm4 ## encoding: [0xc5,0xf8,0x28,0xe2]<br>
+; X64-NEXT: vfnmsub231ss %xmm1, %xmm0, %xmm4 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xbf,0xe1]<br>
+; X64-NEXT: ## xmm4 = -(xmm0 * xmm1) - xmm4<br>
+; X64-NEXT: vmovaps %xmm2, %xmm5 ## encoding: [0xc5,0xf8,0x28,0xea]<br>
+; X64-NEXT: vfnmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm5 ## encoding: [0x62,0xf2,0x7d,0x78,0xbf,0xe9]<br>
+; X64-NEXT: vfnmsub231ss {rz-sae}, %xmm1, %xmm0, %xmm2 {%k1} ## encoding: [0x62,0xf2,0x7d,0x79,0xbf,0xd1]<br>
+; X64-NEXT: vaddps %xmm4, %xmm3, %xmm0 ## encoding: [0xc5,0xe0,0x58,0xc4]<br>
+; X64-NEXT: vaddps %xmm2, %xmm5, %xmm1 ## encoding: [0xc5,0xd0,0x58,0xca]<br>
+; X64-NEXT: vaddps %xmm1, %xmm0, %xmm0 ## encoding: [0xc5,0xf8,0x58,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 4)<br>
+ %res1 = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
+ %res2 = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 3)<br>
+ %res3 = call <4 x float> @llvm.x86.avx512.mask3.vfnmsub.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
+ %res4 = fadd <4 x float> %res, %res1<br>
+ %res5 = fadd <4 x float> %res2, %res3<br>
+ %res6 = fadd <4 x float> %res4, %res5<br>
+ ret <4 x float> %res6<br>
+}<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask3_vfmadd_ss_rm(<4 x float> %x0, <4 x float> %x1, float *%ptr_b ,i8 %x3,i32 %x4) {<br>
+; X86-LABEL: test_int_x86_avx512_mask3_vfmadd_ss_rm:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %cl ## encoding: [0x8a,0x4c,0x24,0x08]<br>
+; X86-NEXT: kmovw %ecx, %k1 ## encoding: [0xc5,0xf8,0x92,0xc9]<br>
+; X86-NEXT: vfmadd231ss (%eax), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb9,0x08]<br>
+; X86-NEXT: ## xmm1 = (xmm0 * mem) + xmm1<br>
+; X86-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x28,0xc1]<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask3_vfmadd_ss_rm:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]<br>
+; X64-NEXT: vfmadd231ss (%rdi), %xmm0, %xmm1 {%k1} ## encoding: [0x62,0xf2,0x7d,0x09,0xb9,0x0f]<br>
+; X64-NEXT: ## xmm1 = (xmm0 * mem) + xmm1<br>
+; X64-NEXT: vmovaps %xmm1, %xmm0 ## encoding: [0xc5,0xf8,0x28,0xc1]<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %q = load float, float* %ptr_b<br>
+ %vecinit.i = insertelement <4 x float> undef, float %q, i32 0<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask3.vfmadd.ss(<4 x float> %x0, <4 x float> %vecinit.i, <4 x float> %x1, i8 %x3, i32 4)<br>
+ ret < 4 x float> %res<br>
+}<br>
+<br>
+define <4 x float>@test_int_x86_avx512_mask_vfmadd_ss_rm(<4 x float> %x0, <4 x float> %x1,float *%ptr_b ,i8 %x3,i32 %x4) {<br>
+; X86-LABEL: test_int_x86_avx512_mask_vfmadd_ss_rm:<br>
+; X86: ## %bb.0:<br>
+; X86-NEXT: movl {{[0-9]+}}(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x04]<br>
+; X86-NEXT: movb {{[0-9]+}}(%esp), %cl ## encoding: [0x8a,0x4c,0x24,0x08]<br>
+; X86-NEXT: kmovw %ecx, %k1 ## encoding: [0xc5,0xf8,0x92,0xc9]<br>
+; X86-NEXT: vfmadd132ss (%eax), %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0x99,0x00]<br>
+; X86-NEXT: ## xmm0 = (xmm0 * mem) + xmm1<br>
+; X86-NEXT: retl ## encoding: [0xc3]<br>
+;<br>
+; X64-LABEL: test_int_x86_avx512_mask_vfmadd_ss_rm:<br>
+; X64: ## %bb.0:<br>
+; X64-NEXT: kmovw %esi, %k1 ## encoding: [0xc5,0xf8,0x92,0xce]<br>
+; X64-NEXT: vfmadd132ss (%rdi), %xmm1, %xmm0 {%k1} ## encoding: [0x62,0xf2,0x75,0x09,0x99,0x07]<br>
+; X64-NEXT: ## xmm0 = (xmm0 * mem) + xmm1<br>
+; X64-NEXT: retq ## encoding: [0xc3]<br>
+ %q = load float, float* %ptr_b<br>
+ %vecinit.i = insertelement <4 x float> undef, float %q, i32 0<br>
+ %res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0,<4 x float> %vecinit.i, <4 x float> %x1, i8 %x3, i32 4)<br>
+ ret < 4 x float> %res<br>
+}<br>
+<br>
+<br>
+define <4 x float>@test_int_x86_avx512_maskz_vfmadd_ss_rm(<4 x float> %x0, <4 x float> %x1,float *%ptr_b ,i8 %x3,i32 %x4) {<br>
+; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ss_rm:<br>
+; CHECK: ## %bb.0:<br>
+; CHECK-NEXT: vxorps %xmm1, %xmm1, %xmm1 ## encoding: [0xc5,0xf0,0x57,0xc9]<br>
+; CHECK-NEXT: vblendps $1, %xmm1, %xmm0, %xmm0 ## encoding: [0xc4,0xe3,0x79,0x0c,0xc1,0x01]<br>
+; CHECK-NEXT: ## xmm0 = xmm1[0],xmm0[1,2,3]<br>
+; CHECK-NEXT: ret{{[l|q]}} ## encoding: [0xc3]<br>
+ %q = load float, float* %ptr_b<br>
+ %vecinit.i = insertelement <4 x float> undef, float %q, i32 0<br>
+ %res = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %vecinit.i, i8 0, i32 4)<br>
+ ret < 4 x float> %res<br>
+}<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll?rev=336871&r1=336870&r2=336871&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll?rev=336871&r1=336870&r2=336871&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/avx512-intrinsics.ll Wed Jul 11 17:29:56 2018<br>
@@ -4340,7 +4340,8 @@ define <2 x double>@test_int_x86_avx512_<br>
ret <2 x double> %res4<br>
}<br>
<br>
-declare <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
+declare double @llvm.fma.f64(double, double, double) #1<br>
+declare double @llvm.x86.avx512.vfmadd.f64(double, double, double, i32) #0<br>
<br>
define <2 x double>@test_int_x86_avx512_mask_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_sd:<br>
@@ -4357,18 +4358,38 @@ define <2 x double>@test_int_x86_avx512_<br>
; CHECK-NEXT: vaddpd %xmm0, %xmm5, %xmm0<br>
; CHECK-NEXT: vaddpd %xmm0, %xmm1, %xmm0<br>
; CHECK-NEXT: retq<br>
- %res = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
- %res1 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
- %res2 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
- %res3 = call <2 x double> @llvm.x86.avx512.mask.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
- %res4 = fadd <2 x double> %res, %res1<br>
- %res5 = fadd <2 x double> %res2, %res3<br>
+ %1 = extractelement <2 x double> %x0, i64 0<br>
+ %2 = extractelement <2 x double> %x1, i64 0<br>
+ %3 = extractelement <2 x double> %x2, i64 0<br>
+ %4 = call double @llvm.fma.f64(double %1, double %2, double %3)<br>
+ %5 = insertelement <2 x double> %x0, double %4, i64 0<br>
+ %6 = extractelement <2 x double> %x0, i64 0<br>
+ %7 = extractelement <2 x double> %x1, i64 0<br>
+ %8 = extractelement <2 x double> %x2, i64 0<br>
+ %9 = call double @llvm.fma.f64(double %6, double %7, double %8)<br>
+ %10 = bitcast i8 %x3 to <8 x i1><br>
+ %11 = extractelement <8 x i1> %10, i64 0<br>
+ %12 = select i1 %11, double %9, double %6<br>
+ %13 = insertelement <2 x double> %x0, double %12, i64 0<br>
+ %14 = extractelement <2 x double> %x0, i64 0<br>
+ %15 = extractelement <2 x double> %x1, i64 0<br>
+ %16 = extractelement <2 x double> %x2, i64 0<br>
+ %17 = call double @llvm.x86.avx512.vfmadd.f64(double %14, double %15, double %16, i32 3)<br>
+ %18 = insertelement <2 x double> %x0, double %17, i64 0<br>
+ %19 = extractelement <2 x double> %x0, i64 0<br>
+ %20 = extractelement <2 x double> %x1, i64 0<br>
+ %21 = extractelement <2 x double> %x2, i64 0<br>
+ %22 = call double @llvm.x86.avx512.vfmadd.f64(double %19, double %20, double %21, i32 3)<br>
+ %23 = bitcast i8 %x3 to <8 x i1><br>
+ %24 = extractelement <8 x i1> %23, i64 0<br>
+ %25 = select i1 %24, double %22, double %19<br>
+ %26 = insertelement <2 x double> %x0, double %25, i64 0<br>
+ %res4 = fadd <2 x double> %5, %13<br>
+ %res5 = fadd <2 x double> %18, %26<br>
%res6 = fadd <2 x double> %res4, %res5<br>
ret <2 x double> %res6<br>
}<br>
<br>
-declare <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
-<br>
define <4 x float>@test_int_x86_avx512_mask_vfmadd_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
; CHECK-LABEL: test_int_x86_avx512_mask_vfmadd_ss:<br>
; CHECK: ## %bb.0:<br>
@@ -4384,18 +4405,38 @@ define <4 x float>@test_int_x86_avx512_m<br>
; CHECK-NEXT: vaddps %xmm0, %xmm5, %xmm0<br>
; CHECK-NEXT: vaddps %xmm0, %xmm1, %xmm0<br>
; CHECK-NEXT: retq<br>
- %res = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 4)<br>
- %res1 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
- %res2 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 -1, i32 3)<br>
- %res3 = call <4 x float> @llvm.x86.avx512.mask.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
- %res4 = fadd <4 x float> %res, %res1<br>
- %res5 = fadd <4 x float> %res2, %res3<br>
+ %1 = extractelement <4 x float> %x0, i64 0<br>
+ %2 = extractelement <4 x float> %x1, i64 0<br>
+ %3 = extractelement <4 x float> %x2, i64 0<br>
+ %4 = call float @llvm.fma.f32(float %1, float %2, float %3)<br>
+ %5 = insertelement <4 x float> %x0, float %4, i64 0<br>
+ %6 = extractelement <4 x float> %x0, i64 0<br>
+ %7 = extractelement <4 x float> %x1, i64 0<br>
+ %8 = extractelement <4 x float> %x2, i64 0<br>
+ %9 = call float @llvm.fma.f32(float %6, float %7, float %8)<br>
+ %10 = bitcast i8 %x3 to <8 x i1><br>
+ %11 = extractelement <8 x i1> %10, i64 0<br>
+ %12 = select i1 %11, float %9, float %6<br>
+ %13 = insertelement <4 x float> %x0, float %12, i64 0<br>
+ %14 = extractelement <4 x float> %x0, i64 0<br>
+ %15 = extractelement <4 x float> %x1, i64 0<br>
+ %16 = extractelement <4 x float> %x2, i64 0<br>
+ %17 = call float @llvm.x86.avx512.vfmadd.f32(float %14, float %15, float %16, i32 3)<br>
+ %18 = insertelement <4 x float> %x0, float %17, i64 0<br>
+ %19 = extractelement <4 x float> %x0, i64 0<br>
+ %20 = extractelement <4 x float> %x1, i64 0<br>
+ %21 = extractelement <4 x float> %x2, i64 0<br>
+ %22 = call float @llvm.x86.avx512.vfmadd.f32(float %19, float %20, float %21, i32 3)<br>
+ %23 = bitcast i8 %x3 to <8 x i1><br>
+ %24 = extractelement <8 x i1> %23, i64 0<br>
+ %25 = select i1 %24, float %22, float %19<br>
+ %26 = insertelement <4 x float> %x0, float %25, i64 0<br>
+ %res4 = fadd <4 x float> %5, %13<br>
+ %res5 = fadd <4 x float> %18, %26<br>
%res6 = fadd <4 x float> %res4, %res5<br>
ret <4 x float> %res6<br>
}<br>
<br>
-declare <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
-<br>
define <2 x double>@test_int_x86_avx512_maskz_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_sd:<br>
; CHECK: ## %bb.0:<br>
@@ -4405,13 +4446,28 @@ define <2 x double>@test_int_x86_avx512_<br>
; CHECK-NEXT: vfmadd213sd {rz-sae}, %xmm2, %xmm1, %xmm0 {%k1} {z}<br>
; CHECK-NEXT: vaddpd %xmm0, %xmm3, %xmm0<br>
; CHECK-NEXT: retq<br>
- %res = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
- %res1 = call <2 x double> @llvm.x86.avx512.maskz.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
- %res2 = fadd <2 x double> %res, %res1<br>
+ %1 = extractelement <2 x double> %x0, i64 0<br>
+ %2 = extractelement <2 x double> %x1, i64 0<br>
+ %3 = extractelement <2 x double> %x2, i64 0<br>
+ %4 = call double @llvm.fma.f64(double %1, double %2, double %3)<br>
+ %5 = bitcast i8 %x3 to <8 x i1><br>
+ %6 = extractelement <8 x i1> %5, i64 0<br>
+ %7 = select i1 %6, double %4, double 0.000000e+00<br>
+ %8 = insertelement <2 x double> %x0, double %7, i64 0<br>
+ %9 = extractelement <2 x double> %x0, i64 0<br>
+ %10 = extractelement <2 x double> %x1, i64 0<br>
+ %11 = extractelement <2 x double> %x2, i64 0<br>
+ %12 = call double @llvm.x86.avx512.vfmadd.f64(double %9, double %10, double %11, i32 3)<br>
+ %13 = bitcast i8 %x3 to <8 x i1><br>
+ %14 = extractelement <8 x i1> %13, i64 0<br>
+ %15 = select i1 %14, double %12, double 0.000000e+00<br>
+ %16 = insertelement <2 x double> %x0, double %15, i64 0<br>
+ %res2 = fadd <2 x double> %8, %16<br>
ret <2 x double> %res2<br>
}<br>
<br>
-declare <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float>, <4 x float>, <4 x float>, i8, i32)<br>
+declare float @llvm.fma.f32(float, float, float) #1<br>
+declare float @llvm.x86.avx512.vfmadd.f32(float, float, float, i32) #0<br>
<br>
define <4 x float>@test_int_x86_avx512_maskz_vfmadd_ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3,i32 %x4 ){<br>
; CHECK-LABEL: test_int_x86_avx512_maskz_vfmadd_ss:<br>
@@ -4419,12 +4475,25 @@ define <4 x float>@test_int_x86_avx512_m<br>
; CHECK-NEXT: kmovw %edi, %k1<br>
; CHECK-NEXT: vfmadd213ss {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2<br>
; CHECK-NEXT: retq<br>
- %res = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 4)<br>
- %res1 = call <4 x float> @llvm.x86.avx512.maskz.vfmadd.ss(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, i8 %x3, i32 3)<br>
- %res2 = fadd <4 x float> %res, %res1<br>
- ret <4 x float> %res<br>
+ %1 = extractelement <4 x float> %x0, i64 0<br>
+ %2 = extractelement <4 x float> %x1, i64 0<br>
+ %3 = extractelement <4 x float> %x2, i64 0<br>
+ %4 = call float @llvm.fma.f32(float %1, float %2, float %3)<br>
+ %5 = bitcast i8 %x3 to <8 x i1><br>
+ %6 = extractelement <8 x i1> %5, i64 0<br>
+ %7 = select i1 %6, float %4, float 0.000000e+00<br>
+ %8 = insertelement <4 x float> %x0, float %7, i64 0<br>
+ %9 = extractelement <4 x float> %x0, i64 0<br>
+ %10 = extractelement <4 x float> %x1, i64 0<br>
+ %11 = extractelement <4 x float> %x2, i64 0<br>
+ %12 = call float @llvm.x86.avx512.vfmadd.f32(float %9, float %10, float %11, i32 3)<br>
+ %13 = bitcast i8 %x3 to <8 x i1><br>
+ %14 = extractelement <8 x i1> %13, i64 0<br>
+ %15 = select i1 %14, float %12, float 0.000000e+00<br>
+ %16 = insertelement <4 x float> %x0, float %15, i64 0<br>
+ %res2 = fadd <4 x float> %8, %16<br>
+ ret <4 x float> %8<br>
}<br>
-declare <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double>, <2 x double>, <2 x double>, i8, i32)<br>
<br>
define <2 x double>@test_int_x86_avx512_mask3_vfmadd_sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3,i32 %x4 ){<br>
; CHECK-LABEL: test_int_x86_avx512_mask3_vfmadd_sd:<br>
@@ -4441,18 +4510,38 @@ define <2 x double>@test_int_x86_avx512_<br>
; CHECK-NEXT: vaddpd %xmm2, %xmm5, %xmm1<br>
; CHECK-NEXT: vaddpd %xmm1, %xmm0, %xmm0<br>
; CHECK-NEXT: retq<br>
- %res = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 4)<br>
- %res1 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 4)<br>
- %res2 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 -1, i32 3)<br>
- %res3 = call <2 x double> @llvm.x86.avx512.mask3.vfmadd.sd(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, i8 %x3, i32 3)<br>
- %res4 = fadd <2 x double> %res, %res1<br>
- %res5 = fadd <2 x double> %res2, %res3<br>
+ %1 = extractelement <2 x double> %x0, i64 0<br>
+ %2 = extractelement <2 x double> %x1, i64 0<br>
+ %3 = extractelement <2 x double> %x2, i64 0<br>
+ %4 = call double @llvm.fma.f64(double %1, double %2, double %3)<br>
+ %5 = insertelement <2 x double> %x2, double %4, i64 0<br>
+ %6 = extractelement <2 x double> %x0, i64 0<br>
+ %7 = extractelement <2 x double> %x1, i64 0<br>
+ %8<br>
</blockquote></div></div></blockquote></div>