<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
.MsoPapDefault
{mso-style-type:export-only;
margin-left:46.2pt;
text-indent:-17.85pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:799957587;
mso-list-type:hybrid;
mso-list-template-ids:-1214880770 44585902 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><a name="_MailEndCompose">> What semantics do these intrinsics have, should they be added to the langref?</a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Definitely, I’m going to do this in one of the next commits. I just have to finish the codegen part first; otherwise the intrinsics would appear in the LangRef without being fully supported. <o:p></o:p></span></p>
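<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">In the meantime, the intended semantics can be sketched as scalar loops (a minimal illustration of the usual AVX-512 expand/compress behavior; the function names below are made up for this sketch and are not LLVM API): expandload reads consecutive elements from memory into the enabled lanes, disabled lanes keep the pass-through value; compressstore writes the enabled lanes out to consecutive memory locations.<o:p></o:p></span></p>

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of @llvm.masked.expandload: consecutive elements are read
 * from base and "expanded" into the lanes whose mask bit is set; lanes with
 * a clear mask bit keep the pass-through value from src0. */
void expandload_f32(float *dst, const float *base, const int *mask,
                    const float *src0, size_t n) {
    size_t j = 0; /* index into consecutive memory */
    for (size_t i = 0; i < n; ++i)
        dst[i] = mask[i] ? base[j++] : src0[i];
}

/* Scalar model of @llvm.masked.compressstore: lanes whose mask bit is set
 * are "compressed" into consecutive memory locations starting at base. */
size_t compressstore_f32(float *base, const float *src, const int *mask,
                         size_t n) {
    size_t j = 0;
    for (size_t i = 0; i < n; ++i)
        if (mask[i])
            base[j++] = src[i];
    return j; /* number of elements actually stored */
}
```

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Note that, unlike masked load/store, the memory footprint depends on the mask population count, which is why the intrinsics take no alignment operand.<o:p></o:p></span></p>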
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1">
<![if !supportLists]><span style="font-family:"Calibri",sans-serif;color:#2F5496"><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span dir="LTR"></span><b><i><span style="color:#2F5496"> Elena<o:p></o:p></span></i></b></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><a name="_____replyseparator"></a><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> David Majnemer [mailto:david.majnemer@gmail.com]
<br>
<b>Sent:</b> Thursday, November 03, 2016 05:55<br>
<b>To:</b> Demikhovsky, Elena <elena.demikhovsky@intel.com><br>
<b>Cc:</b> llvm-commits <llvm-commits@lists.llvm.org><br>
<b>Subject:</b> Re: [llvm] r285876 - Expandload and Compressstore intrinsics<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">What semantics do these intrinsics have, should they be added to the langref?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">On Wed, Nov 2, 2016 at 8:23 PM, Elena Demikhovsky via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">Author: delena<br>
Date: Wed Nov 2 22:23:55 2016<br>
New Revision: 285876<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=285876&view=rev" target="_blank">
http://llvm.org/viewvc/llvm-project?rev=285876&view=rev</a><br>
Log:<br>
Expandload and Compressstore intrinsics<br>
<br>
2 new intrinsics covering AVX-512 compress/expand functionality.<br>
This implementation includes syntax, DAG builder, operation lowering and tests.<br>
Does not include: handling of illegal data types, codegen prepare pass and the cost model.<br>
<br>
<br>
Added:<br>
llvm/trunk/test/CodeGen/X86/compress_expand.ll<br>
Modified:<br>
llvm/trunk/include/llvm/IR/Intrinsics.h<br>
llvm/trunk/include/llvm/IR/Intrinsics.td<br>
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp<br>
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp<br>
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h<br>
llvm/trunk/lib/IR/Function.cpp<br>
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td<br>
llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>
llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>
<br>
Modified: llvm/trunk/include/llvm/IR/Intrinsics.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.h?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.h?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/IR/Intrinsics.h (original)<br>
+++ llvm/trunk/include/llvm/IR/Intrinsics.h Wed Nov 2 22:23:55 2016<br>
@@ -100,7 +100,7 @@ namespace Intrinsic {<br>
Void, VarArg, MMX, Token, Metadata, Half, Float, Double,<br>
Integer, Vector, Pointer, Struct,<br>
Argument, ExtendArgument, TruncArgument, HalfVecArgument,<br>
- SameVecWidthArgument, PtrToArgument, VecOfPtrsToElt<br>
+ SameVecWidthArgument, PtrToArgument, PtrToElt, VecOfPtrsToElt<br>
} Kind;<br>
<br>
union {<br>
@@ -123,7 +123,7 @@ namespace Intrinsic {<br>
assert(Kind == Argument || Kind == ExtendArgument ||<br>
Kind == TruncArgument || Kind == HalfVecArgument ||<br>
Kind == SameVecWidthArgument || Kind == PtrToArgument ||<br>
- Kind == VecOfPtrsToElt);<br>
+ Kind == PtrToElt || Kind == VecOfPtrsToElt);<br>
return Argument_Info >> 3;<br>
}<br>
ArgKind getArgumentKind() const {<br>
<br>
Modified: llvm/trunk/include/llvm/IR/Intrinsics.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/IR/Intrinsics.td (original)<br>
+++ llvm/trunk/include/llvm/IR/Intrinsics.td Wed Nov 2 22:23:55 2016<br>
@@ -133,6 +133,7 @@ class LLVMVectorSameWidth<int num, LLVMT<br>
ValueType ElTy = elty.VT;<br>
}<br>
class LLVMPointerTo<int num> : LLVMMatchType<num>;<br>
+class LLVMPointerToElt<int num> : LLVMMatchType<num>;<br>
class LLVMVectorOfPointersToElt<int num> : LLVMMatchType<num>;<br>
<br>
// Match the type of another intrinsic parameter that is expected to be a<br>
@@ -718,13 +719,25 @@ def int_masked_gather: Intrinsic<[llvm_a<br>
[LLVMVectorOfPointersToElt<0>, llvm_i32_ty,<br>
LLVMVectorSameWidth<0, llvm_i1_ty>,<br>
LLVMMatchType<0>],<br>
- [IntrReadMem]>;<br>
+ [IntrReadMem]>;<br>
<br>
def int_masked_scatter: Intrinsic<[],<br>
[llvm_anyvector_ty,<br>
LLVMVectorOfPointersToElt<0>, llvm_i32_ty,<br>
LLVMVectorSameWidth<0, llvm_i1_ty>]>;<br>
<br>
+def int_masked_expandload: Intrinsic<[llvm_anyvector_ty],<br>
+ [LLVMPointerToElt<0>,<br>
+ LLVMVectorSameWidth<0, llvm_i1_ty>,<br>
+ LLVMMatchType<0>],<br>
+ [IntrReadMem]>;<br>
+<br>
+def int_masked_compressstore: Intrinsic<[],<br>
+ [llvm_anyvector_ty,<br>
+ LLVMPointerToElt<0>,<br>
+ LLVMVectorSameWidth<0, llvm_i1_ty>],<br>
+ [IntrArgMemOnly]>;<br>
+<br>
// Test whether a pointer is associated with a type metadata identifier.<br>
def int_type_test : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_metadata_ty],<br>
[IntrNoMem]>;<br>
<br>
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Wed Nov 2 22:23:55 2016<br>
@@ -5583,7 +5583,7 @@ SDValue DAGCombiner::visitMSTORE(SDNode<br>
Alignment, MST->getAAInfo(), MST->getRanges());<br>
<br>
Lo = DAG.getMaskedStore(Chain, DL, DataLo, Ptr, MaskLo, LoMemVT, MMO,<br>
- MST->isTruncatingStore());<br>
+ MST->isTruncatingStore(), MST->isCompressingStore());<br>
<br>
unsigned IncrementSize = LoMemVT.getSizeInBits()/8;<br>
Ptr = DAG.getNode(ISD::ADD, DL, Ptr.getValueType(), Ptr,<br>
@@ -5596,7 +5596,7 @@ SDValue DAGCombiner::visitMSTORE(SDNode<br>
MST->getRanges());<br>
<br>
Hi = DAG.getMaskedStore(Chain, DL, DataHi, Ptr, MaskHi, HiMemVT, MMO,<br>
- MST->isTruncatingStore());<br>
+ MST->isTruncatingStore(), MST->isCompressingStore());<br>
<br>
AddToWorklist(Lo.getNode());<br>
AddToWorklist(Hi.getNode());<br>
<br>
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp Wed Nov 2 22:23:55 2016<br>
@@ -1212,7 +1212,7 @@ SDValue DAGTypeLegalizer::PromoteIntOp_M<br>
<br>
return DAG.getMaskedStore(N->getChain(), dl, DataOp, N->getBasePtr(), Mask,<br>
N->getMemoryVT(), N->getMemOperand(),<br>
- TruncateStore);<br>
+ TruncateStore, N->isCompressingStore());<br>
}<br>
<br>
SDValue DAGTypeLegalizer::PromoteIntOp_MLOAD(MaskedLoadSDNode *N,<br>
<br>
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp Wed Nov 2 22:23:55 2016<br>
@@ -3667,16 +3667,39 @@ void SelectionDAGBuilder::visitStore(con<br>
DAG.setRoot(StoreNode);<br>
}<br>
<br>
-void SelectionDAGBuilder::visitMaskedStore(const CallInst &I) {<br>
+void SelectionDAGBuilder::visitMaskedStore(const CallInst &I,<br>
+ bool IsCompressing) {<br>
SDLoc sdl = getCurSDLoc();<br>
<br>
- // llvm.masked.store.*(Src0, Ptr, alignment, Mask)<br>
- Value *PtrOperand = I.getArgOperand(1);<br>
+ auto getMaskedStoreOps = [&](Value* &Ptr, Value* &Mask, Value* &Src0,<br>
+ unsigned& Alignment) {<br>
+ // llvm.masked.store.*(Src0, Ptr, alignment, Mask)<br>
+ Src0 = I.getArgOperand(0);<br>
+ Ptr = I.getArgOperand(1);<br>
+ Alignment = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();<br>
+ Mask = I.getArgOperand(3);<br>
+ };<br>
+ auto getCompressingStoreOps = [&](Value* &Ptr, Value* &Mask, Value* &Src0,<br>
+ unsigned& Alignment) {<br>
+ // llvm.masked.compressstore.*(Src0, Ptr, Mask)<br>
+ Src0 = I.getArgOperand(0);<br>
+ Ptr = I.getArgOperand(1);<br>
+ Mask = I.getArgOperand(2);<br>
+ Alignment = 0;<br>
+ };<br>
+<br>
+ Value *PtrOperand, *MaskOperand, *Src0Operand;<br>
+ unsigned Alignment;<br>
+ if (IsCompressing)<br>
+ getCompressingStoreOps(PtrOperand, MaskOperand, Src0Operand, Alignment);<br>
+ else<br>
+ getMaskedStoreOps(PtrOperand, MaskOperand, Src0Operand, Alignment);<br>
+<br>
SDValue Ptr = getValue(PtrOperand);<br>
- SDValue Src0 = getValue(I.getArgOperand(0));<br>
- SDValue Mask = getValue(I.getArgOperand(3));<br>
+ SDValue Src0 = getValue(Src0Operand);<br>
+ SDValue Mask = getValue(MaskOperand);<br>
+<br>
EVT VT = Src0.getValueType();<br>
- unsigned Alignment = (cast<ConstantInt>(I.getArgOperand(2)))->getZExtValue();<br>
if (!Alignment)<br>
Alignment = DAG.getEVTAlignment(VT);<br>
<br>
@@ -3689,7 +3712,8 @@ void SelectionDAGBuilder::visitMaskedSto<br>
MachineMemOperand::MOStore, VT.getStoreSize(),<br>
Alignment, AAInfo);<br>
SDValue StoreNode = DAG.getMaskedStore(getRoot(), sdl, Src0, Ptr, Mask, VT,<br>
- MMO, false);<br>
+ MMO, false /* Truncating */,<br>
+ IsCompressing);<br>
DAG.setRoot(StoreNode);<br>
setValue(&I, StoreNode);<br>
}<br>
@@ -3710,7 +3734,7 @@ void SelectionDAGBuilder::visitMaskedSto<br>
// extract the spalt value and use it as a uniform base.<br>
// In all other cases the function returns 'false'.<br>
//<br>
-static bool getUniformBase(const Value *& Ptr, SDValue& Base, SDValue& Index,<br>
+static bool getUniformBase(const Value* &Ptr, SDValue& Base, SDValue& Index,<br>
SelectionDAGBuilder* SDB) {<br>
<br>
SelectionDAG& DAG = SDB->DAG;<br>
@@ -3790,18 +3814,38 @@ void SelectionDAGBuilder::visitMaskedSca<br>
setValue(&I, Scatter);<br>
}<br>
<br>
-void SelectionDAGBuilder::visitMaskedLoad(const CallInst &I) {<br>
+void SelectionDAGBuilder::visitMaskedLoad(const CallInst &I, bool IsExpanding) {<br>
SDLoc sdl = getCurSDLoc();<br>
<br>
- // @llvm.masked.load.*(Ptr, alignment, Mask, Src0)<br>
- Value *PtrOperand = I.getArgOperand(0);<br>
+ auto getMaskedLoadOps = [&](Value* &Ptr, Value* &Mask, Value* &Src0,<br>
+ unsigned& Alignment) {<br>
+ // @llvm.masked.load.*(Ptr, alignment, Mask, Src0)<br>
+ Ptr = I.getArgOperand(0);<br>
+ Alignment = cast<ConstantInt>(I.getArgOperand(1))->getZExtValue();<br>
+ Mask = I.getArgOperand(2);<br>
+ Src0 = I.getArgOperand(3);<br>
+ };<br>
+ auto getExpandingLoadOps = [&](Value* &Ptr, Value* &Mask, Value* &Src0,<br>
+ unsigned& Alignment) {<br>
+ // @llvm.masked.expandload.*(Ptr, Mask, Src0)<br>
+ Ptr = I.getArgOperand(0);<br>
+ Alignment = 0;<br>
+ Mask = I.getArgOperand(1);<br>
+ Src0 = I.getArgOperand(2);<br>
+ };<br>
+<br>
+ Value *PtrOperand, *MaskOperand, *Src0Operand;<br>
+ unsigned Alignment;<br>
+ if (IsExpanding)<br>
+ getExpandingLoadOps(PtrOperand, MaskOperand, Src0Operand, Alignment);<br>
+ else<br>
+ getMaskedLoadOps(PtrOperand, MaskOperand, Src0Operand, Alignment);<br>
+<br>
SDValue Ptr = getValue(PtrOperand);<br>
- SDValue Src0 = getValue(I.getArgOperand(3));<br>
- SDValue Mask = getValue(I.getArgOperand(2));<br>
+ SDValue Src0 = getValue(Src0Operand);<br>
+ SDValue Mask = getValue(MaskOperand);<br>
<br>
- const TargetLowering &TLI = DAG.getTargetLoweringInfo();<br>
- EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());<br>
- unsigned Alignment = (cast<ConstantInt>(I.getArgOperand(1)))->getZExtValue();<br>
+ EVT VT = Src0.getValueType();<br>
if (!Alignment)<br>
Alignment = DAG.getEVTAlignment(VT);<br>
<br>
@@ -3821,7 +3865,7 @@ void SelectionDAGBuilder::visitMaskedLoa<br>
Alignment, AAInfo, Ranges);<br>
<br>
SDValue Load = DAG.getMaskedLoad(VT, sdl, InChain, Ptr, Mask, Src0, VT, MMO,<br>
- ISD::NON_EXTLOAD, false);<br>
+ ISD::NON_EXTLOAD, IsExpanding);<br>
if (AddToChain) {<br>
SDValue OutChain = Load.getValue(1);<br>
DAG.setRoot(OutChain);<br>
@@ -5054,6 +5098,12 @@ SelectionDAGBuilder::visitIntrinsicCall(<br>
case Intrinsic::masked_store:<br>
visitMaskedStore(I);<br>
return nullptr;<br>
+ case Intrinsic::masked_expandload:<br>
+ visitMaskedLoad(I, true /* IsExpanding */);<br>
+ return nullptr;<br>
+ case Intrinsic::masked_compressstore:<br>
+ visitMaskedStore(I, true /* IsCompressing */);<br>
+ return nullptr;<br>
case Intrinsic::x86_mmx_pslli_w:<br>
case Intrinsic::x86_mmx_pslli_d:<br>
case Intrinsic::x86_mmx_pslli_q:<br>
<br>
Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (original)<br>
+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h Wed Nov 2 22:23:55 2016<br>
@@ -874,8 +874,8 @@ private:<br>
void visitAlloca(const AllocaInst &I);<br>
void visitLoad(const LoadInst &I);<br>
void visitStore(const StoreInst &I);<br>
- void visitMaskedLoad(const CallInst &I);<br>
- void visitMaskedStore(const CallInst &I);<br>
+ void visitMaskedLoad(const CallInst &I, bool IsExpanding = false);<br>
+ void visitMaskedStore(const CallInst &I, bool IsCompressing = false);<br>
void visitMaskedGather(const CallInst &I);<br>
void visitMaskedScatter(const CallInst &I);<br>
void visitAtomicCmpXchg(const AtomicCmpXchgInst &I);<br>
<br>
Modified: llvm/trunk/lib/IR/Function.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/IR/Function.cpp (original)<br>
+++ llvm/trunk/lib/IR/Function.cpp Wed Nov 2 22:23:55 2016<br>
@@ -607,10 +607,11 @@ enum IIT_Info {<br>
IIT_HALF_VEC_ARG = 30,<br>
IIT_SAME_VEC_WIDTH_ARG = 31,<br>
IIT_PTR_TO_ARG = 32,<br>
- IIT_VEC_OF_PTRS_TO_ELT = 33,<br>
- IIT_I128 = 34,<br>
- IIT_V512 = 35,<br>
- IIT_V1024 = 36<br>
+ IIT_PTR_TO_ELT = 33,<br>
+ IIT_VEC_OF_PTRS_TO_ELT = 34,<br>
+ IIT_I128 = 35,<br>
+ IIT_V512 = 36,<br>
+ IIT_V1024 = 37<br>
};<br>
<br>
<br>
@@ -744,6 +745,11 @@ static void DecodeIITType(unsigned &Next<br>
ArgInfo));<br>
return;<br>
}<br>
+ case IIT_PTR_TO_ELT: {<br>
+ unsigned ArgInfo = (NextElt == Infos.size() ? 0 : Infos[NextElt++]);<br>
+ OutputTable.push_back(IITDescriptor::get(IITDescriptor::PtrToElt, ArgInfo));<br>
+ return;<br>
+ }<br>
case IIT_VEC_OF_PTRS_TO_ELT: {<br>
unsigned ArgInfo = (NextElt == Infos.size() ? 0 : Infos[NextElt++]);<br>
OutputTable.push_back(IITDescriptor::get(IITDescriptor::VecOfPtrsToElt,<br>
@@ -870,6 +876,14 @@ static Type *DecodeFixedType(ArrayRef<In<br>
Type *Ty = Tys[D.getArgumentNumber()];<br>
return PointerType::getUnqual(Ty);<br>
}<br>
+ case IITDescriptor::PtrToElt: {<br>
+ Type *Ty = Tys[D.getArgumentNumber()];<br>
+ VectorType *VTy = dyn_cast<VectorType>(Ty);<br>
+ if (!VTy)<br>
+ llvm_unreachable("Expected an argument of Vector Type");<br>
+ Type *EltTy = VTy->getVectorElementType();<br>
+ return PointerType::getUnqual(EltTy);<br>
+ }<br>
case IITDescriptor::VecOfPtrsToElt: {<br>
Type *Ty = Tys[D.getArgumentNumber()];<br>
VectorType *VTy = dyn_cast<VectorType>(Ty);<br>
@@ -1048,7 +1062,7 @@ bool Intrinsic::matchIntrinsicType(Type<br>
if (D.getArgumentNumber() >= ArgTys.size())<br>
return true;<br>
VectorType * ReferenceType =<br>
- dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>
+ dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>
VectorType *ThisArgType = dyn_cast<VectorType>(Ty);<br>
if (!ThisArgType || !ReferenceType ||<br>
(ReferenceType->getVectorNumElements() !=<br>
@@ -1064,6 +1078,16 @@ bool Intrinsic::matchIntrinsicType(Type<br>
PointerType *ThisArgType = dyn_cast<PointerType>(Ty);<br>
return (!ThisArgType || ThisArgType->getElementType() != ReferenceType);<br>
}<br>
+ case IITDescriptor::PtrToElt: {<br>
+ if (D.getArgumentNumber() >= ArgTys.size())<br>
+ return true;<br>
+ VectorType * ReferenceType =<br>
+ dyn_cast<VectorType> (ArgTys[D.getArgumentNumber()]);<br>
+ PointerType *ThisArgType = dyn_cast<PointerType>(Ty);<br>
+<br>
+ return (!ThisArgType || !ReferenceType ||<br>
+ ThisArgType->getElementType() != ReferenceType->getElementType());<br>
+ }<br>
case IITDescriptor::VecOfPtrsToElt: {<br>
if (D.getArgumentNumber() >= ArgTys.size())<br>
return true;<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Nov 2 22:23:55 2016<br>
@@ -1232,10 +1232,11 @@ X86TargetLowering::X86TargetLowering(con<br>
setTruncStoreAction(MVT::v4i32, MVT::v4i8, Legal);<br>
setTruncStoreAction(MVT::v4i32, MVT::v4i16, Legal);<br>
} else {<br>
- setOperationAction(ISD::MLOAD, MVT::v8i32, Custom);<br>
- setOperationAction(ISD::MLOAD, MVT::v8f32, Custom);<br>
- setOperationAction(ISD::MSTORE, MVT::v8i32, Custom);<br>
- setOperationAction(ISD::MSTORE, MVT::v8f32, Custom);<br>
+ for (auto VT : {MVT::v4i32, MVT::v8i32, MVT::v2i64, MVT::v4i64,<br>
+ MVT::v4f32, MVT::v8f32, MVT::v2f64, MVT::v4f64}) {<br>
+ setOperationAction(ISD::MLOAD, VT, Custom);<br>
+ setOperationAction(ISD::MSTORE, VT, Custom);<br>
+ }<br>
}<br>
setOperationAction(ISD::TRUNCATE, MVT::i1, Custom);<br>
setOperationAction(ISD::TRUNCATE, MVT::v16i8, Custom);<br>
@@ -21940,26 +21941,48 @@ static SDValue LowerMLOAD(SDValue Op, co<br>
SDValue Mask = N->getMask();<br>
SDLoc dl(Op);<br>
<br>
+ assert((!N->isExpandingLoad() || Subtarget.hasAVX512()) &&<br>
+ "Expanding masked load is supported on AVX-512 target only!");<br>
+<br>
+ assert((!N->isExpandingLoad() || ScalarVT.getSizeInBits() >= 32) &&<br>
+ "Expanding masked load is supported for 32 and 64-bit types only!");<br>
+<br>
+ // 4x32, 4x64 and 2x64 vectors of non-expanding loads are legal regardless of<br>
+ // VLX. For expanding loads, these types are handled here.<br>
+ if (!N->isExpandingLoad() && VT.getVectorNumElements() <= 4)<br>
+ return Op;<br>
+<br>
assert(Subtarget.hasAVX512() && !Subtarget.hasVLX() && !VT.is512BitVector() &&<br>
"Cannot lower masked load op.");<br>
<br>
- assert(((ScalarVT == MVT::i32 || ScalarVT == MVT::f32) ||<br>
+ assert((ScalarVT.getSizeInBits() >= 32 ||<br>
(Subtarget.hasBWI() &&<br>
(ScalarVT == MVT::i8 || ScalarVT == MVT::i16))) &&<br>
"Unsupported masked load op.");<br>
<br>
// This operation is legal for targets with VLX, but without<br>
// VLX the vector should be widened to 512 bit<br>
- unsigned NumEltsInWideVec = 512/VT.getScalarSizeInBits();<br>
+ unsigned NumEltsInWideVec = 512 / VT.getScalarSizeInBits();<br>
MVT WideDataVT = MVT::getVectorVT(ScalarVT, NumEltsInWideVec);<br>
- MVT WideMaskVT = MVT::getVectorVT(MVT::i1, NumEltsInWideVec);<br>
SDValue Src0 = N->getSrc0();<br>
Src0 = ExtendToType(Src0, WideDataVT, DAG);<br>
+<br>
+ // Mask element has to be i1<br>
+ MVT MaskEltTy = Mask.getSimpleValueType().getScalarType();<br>
+ assert((MaskEltTy == MVT::i1 || VT.getVectorNumElements() <= 4) &&<br>
+ "We handle 4x32, 4x64 and 2x64 vectors only in this case");<br>
+<br>
+ MVT WideMaskVT = MVT::getVectorVT(MaskEltTy, NumEltsInWideVec);<br>
+<br>
Mask = ExtendToType(Mask, WideMaskVT, DAG, true);<br>
+ if (MaskEltTy != MVT::i1)<br>
+ Mask = DAG.getNode(ISD::TRUNCATE, dl,<br>
+ MVT::getVectorVT(MVT::i1, NumEltsInWideVec), Mask);<br>
SDValue NewLoad = DAG.getMaskedLoad(WideDataVT, dl, N->getChain(),<br>
N->getBasePtr(), Mask, Src0,<br>
N->getMemoryVT(), N->getMemOperand(),<br>
- N->getExtensionType());<br>
+ N->getExtensionType(),<br>
+ N->isExpandingLoad());<br>
<br>
SDValue Exract = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT,<br>
NewLoad.getValue(0),<br>
@@ -21977,10 +22000,20 @@ static SDValue LowerMSTORE(SDValue Op, c<br>
SDValue Mask = N->getMask();<br>
SDLoc dl(Op);<br>
<br>
+ assert((!N->isCompressingStore() || Subtarget.hasAVX512()) &&<br>
+ "Compressing masked store is supported on AVX-512 target only!");<br>
+<br>
+ assert((!N->isCompressingStore() || ScalarVT.getSizeInBits() >= 32) &&<br>
+ "Compressing masked store is supported for 32 and 64-bit types only!");<br>
+<br>
+ // 4x32 and 2x64 vectors of non-compressing stores are legal regardless of VLX.<br>
+ if (!N->isCompressingStore() && VT.getVectorNumElements() <= 4)<br>
+ return Op;<br>
+<br>
assert(Subtarget.hasAVX512() && !Subtarget.hasVLX() && !VT.is512BitVector() &&<br>
"Cannot lower masked store op.");<br>
<br>
- assert(((ScalarVT == MVT::i32 || ScalarVT == MVT::f32) ||<br>
+ assert((ScalarVT.getSizeInBits() >= 32 ||<br>
(Subtarget.hasBWI() &&<br>
(ScalarVT == MVT::i8 || ScalarVT == MVT::i16))) &&<br>
"Unsupported masked store op.");<br>
@@ -21989,12 +22022,22 @@ static SDValue LowerMSTORE(SDValue Op, c<br>
// VLX the vector should be widened to 512 bit<br>
unsigned NumEltsInWideVec = 512/VT.getScalarSizeInBits();<br>
MVT WideDataVT = MVT::getVectorVT(ScalarVT, NumEltsInWideVec);<br>
- MVT WideMaskVT = MVT::getVectorVT(MVT::i1, NumEltsInWideVec);<br>
+<br>
+ // Mask element has to be i1<br>
+ MVT MaskEltTy = Mask.getSimpleValueType().getScalarType();<br>
+ assert((MaskEltTy == MVT::i1 || VT.getVectorNumElements() <= 4) &&<br>
+ "We handle 4x32, 4x64 and 2x64 vectors only in this case");<br>
+<br>
+ MVT WideMaskVT = MVT::getVectorVT(MaskEltTy, NumEltsInWideVec);<br>
+<br>
DataToStore = ExtendToType(DataToStore, WideDataVT, DAG);<br>
Mask = ExtendToType(Mask, WideMaskVT, DAG, true);<br>
+ if (MaskEltTy != MVT::i1)<br>
+ Mask = DAG.getNode(ISD::TRUNCATE, dl,<br>
+ MVT::getVectorVT(MVT::i1, NumEltsInWideVec), Mask);<br>
return DAG.getMaskedStore(N->getChain(), dl, DataToStore, N->getBasePtr(),<br>
Mask, N->getMemoryVT(), N->getMemOperand(),<br>
- N->isTruncatingStore());<br>
+ N->isTruncatingStore(), N->isCompressingStore());<br>
}<br>
<br>
static SDValue LowerMGATHER(SDValue Op, const X86Subtarget &Subtarget,<br>
@@ -29881,6 +29924,11 @@ static SDValue combineMaskedLoad(SDNode<br>
TargetLowering::DAGCombinerInfo &DCI,<br>
const X86Subtarget &Subtarget) {<br>
MaskedLoadSDNode *Mld = cast<MaskedLoadSDNode>(N);<br>
+<br>
+ // TODO: Expanding load with constant mask may be optimized as well.<br>
+ if (Mld->isExpandingLoad())<br>
+ return SDValue();<br>
+<br>
if (Mld->getExtensionType() == ISD::NON_EXTLOAD) {<br>
if (SDValue ScalarLoad = reduceMaskedLoadToScalarLoad(Mld, DAG, DCI))<br>
return ScalarLoad;<br>
@@ -29996,6 +30044,10 @@ static SDValue reduceMaskedStoreToScalar<br>
static SDValue combineMaskedStore(SDNode *N, SelectionDAG &DAG,<br>
const X86Subtarget &Subtarget) {<br>
MaskedStoreSDNode *Mst = cast<MaskedStoreSDNode>(N);<br>
+<br>
+ if (Mst->isCompressingStore())<br>
+ return SDValue();<br>
+<br>
if (!Mst->isTruncatingStore())<br>
return reduceMaskedStoreToScalarStore(Mst, DAG);<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td Wed Nov 2 22:23:55 2016<br>
@@ -965,28 +965,23 @@ def X86mstore : PatFrag<(ops node:$src1,<br>
<br>
def masked_store_aligned128 : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
(X86mstore node:$src1, node:$src2, node:$src3), [{<br>
- if (auto *Store = dyn_cast<MaskedStoreSDNode>(N))<br>
- return Store->getAlignment() >= 16;<br>
- return false;<br>
+ return cast<MaskedStoreSDNode>(N)->getAlignment() >= 16;<br>
}]>;<br>
<br>
def masked_store_aligned256 : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
(X86mstore node:$src1, node:$src2, node:$src3), [{<br>
- if (auto *Store = dyn_cast<MaskedStoreSDNode>(N))<br>
- return Store->getAlignment() >= 32;<br>
- return false;<br>
+ return cast<MaskedStoreSDNode>(N)->getAlignment() >= 32;<br>
}]>;<br>
<br>
def masked_store_aligned512 : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
(X86mstore node:$src1, node:$src2, node:$src3), [{<br>
- if (auto *Store = dyn_cast<MaskedStoreSDNode>(N))<br>
- return Store->getAlignment() >= 64;<br>
- return false;<br>
+ return cast<MaskedStoreSDNode>(N)->getAlignment() >= 64;<br>
}]>;<br>
<br>
def masked_store_unaligned : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
- (X86mstore node:$src1, node:$src2, node:$src3), [{<br>
- return isa<MaskedStoreSDNode>(N);<br>
+ (masked_store node:$src1, node:$src2, node:$src3), [{<br>
+ return (!cast<MaskedStoreSDNode>(N)->isTruncatingStore()) &&<br>
+ (!cast<MaskedStoreSDNode>(N)->isCompressingStore());<br>
}]>;<br>
<br>
def X86mCompressingStore : PatFrag<(ops node:$src1, node:$src2, node:$src3),<br>
<br>
Added: llvm/trunk/test/CodeGen/X86/compress_expand.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/compress_expand.ll?rev=285876&view=auto" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/compress_expand.ll?rev=285876&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/compress_expand.ll (added)<br>
+++ llvm/trunk/test/CodeGen/X86/compress_expand.ll Wed Nov 2 22:23:55 2016<br>
@@ -0,0 +1,247 @@<br>
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py<br>
+; RUN: llc -mattr=+avx512vl,+avx512dq,+avx512bw < %s | FileCheck %s --check-prefix=ALL --check-prefix=SKX<br>
+; RUN: llc -mattr=+avx512f < %s | FileCheck %s --check-prefix=ALL --check-prefix=KNL<br>
+<br>
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"<br>
+target triple = "x86_64-unknown-linux-gnu"<br>
+<br>
+<br>
+<br>
+define <16 x float> @test1(float* %base) {<br>
+; ALL-LABEL: test1:<br>
+; ALL: # BB#0:<br>
+; ALL-NEXT: movw $-2049, %ax # imm = 0xF7FF<br>
+; ALL-NEXT: kmovw %eax, %k1<br>
+; ALL-NEXT: vexpandps (%rdi), %zmm0 {%k1} {z}<br>
+; ALL-NEXT: retq<br>
+ %res = call <16 x float> @llvm.masked.expandload.v16f32(float* %base, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 true>, <16 x float> undef)<br>
+ ret <16 x float>%res<br>
+}<br>
+<br>
+define <16 x float> @test2(float* %base, <16 x float> %src0) {<br>
+; ALL-LABEL: test2:<br>
+; ALL: # BB#0:<br>
+; ALL-NEXT: movw $30719, %ax # imm = 0x77FF<br>
+; ALL-NEXT: kmovw %eax, %k1<br>
+; ALL-NEXT: vexpandps (%rdi), %zmm0 {%k1}<br>
+; ALL-NEXT: retq<br>
+ %res = call <16 x float> @llvm.masked.expandload.v16f32(float* %base, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 false>, <16 x float> %src0)<br>
+ ret <16 x float>%res<br>
+}<br>
+<br>
+define <8 x double> @test3(double* %base, <8 x double> %src0, <8 x i1> %mask) {<br>
+; SKX-LABEL: test3:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpsllw $15, %xmm1, %xmm1<br>
+; SKX-NEXT: vpmovw2m %xmm1, %k1<br>
+; SKX-NEXT: vexpandpd (%rdi), %zmm0 {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test3:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: vpmovsxwq %xmm1, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vexpandpd (%rdi), %zmm0 {%k1}<br>
+; KNL-NEXT: retq<br>
+ %res = call <8 x double> @llvm.masked.expandload.v8f64(double* %base, <8 x i1> %mask, <8 x double> %src0)<br>
+ ret <8 x double>%res<br>
+}<br>
+<br>
+define <4 x float> @test4(float* %base, <4 x float> %src0) {<br>
+; SKX-LABEL: test4:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: movb $7, %al<br>
+; SKX-NEXT: kmovb %eax, %k1<br>
+; SKX-NEXT: vexpandps (%rdi), %xmm0 {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test4:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: movw $7, %ax<br>
+; KNL-NEXT: kmovw %eax, %k1<br>
+; KNL-NEXT: vexpandps (%rdi), %zmm0 {%k1}<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<kill><br>
+; KNL-NEXT: retq<br>
+ %res = call <4 x float> @llvm.masked.expandload.v4f32(float* %base, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, <4 x float> %src0)<br>
+ ret <4 x float>%res<br>
+}<br>
+<br>
+define <2 x i64> @test5(i64* %base, <2 x i64> %src0) {<br>
+; SKX-LABEL: test5:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: movb $2, %al<br>
+; SKX-NEXT: kmovb %eax, %k1<br>
+; SKX-NEXT: vpexpandq (%rdi), %xmm0 {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test5:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: movb $2, %al<br>
+; KNL-NEXT: kmovw %eax, %k1<br>
+; KNL-NEXT: vpexpandq (%rdi), %zmm0 {%k1}<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<kill><br>
+; KNL-NEXT: retq<br>
+ %res = call <2 x i64> @llvm.masked.expandload.v2i64(i64* %base, <2 x i1> <i1 false, i1 true>, <2 x i64> %src0)<br>
+ ret <2 x i64>%res<br>
+}<br>
+<br>
+declare <16 x float> @llvm.masked.expandload.v16f32(float*, <16 x i1>, <16 x float>)<br>
+declare <8 x double> @llvm.masked.expandload.v8f64(double*, <8 x i1>, <8 x double>)<br>
+declare <4 x float> @llvm.masked.expandload.v4f32(float*, <4 x i1>, <4 x float>)<br>
+declare <2 x i64> @llvm.masked.expandload.v2i64(i64*, <2 x i1>, <2 x i64>)<br>
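The expandload tests above exercise the semantics documented in the LangRef: active mask lanes consume *consecutive* elements from the base pointer, while inactive lanes keep the pass-through value. A scalar reference sketch of that behavior (illustrative only, not the AVX-512 codegen path this patch adds):

```cpp
#include <cstddef>
#include <vector>

// Scalar semantics of llvm.masked.expandload: the memory index advances
// only on active lanes, "expanding" a packed run of elements into the
// enabled positions of the result vector.
template <typename T>
std::vector<T> expandLoad(const T *Base, const std::vector<bool> &Mask,
                          const std::vector<T> &PassThru) {
  std::vector<T> Result(PassThru);
  std::size_t MemIdx = 0; // advances only when a lane is enabled
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      Result[Lane] = Base[MemIdx++];
  return Result;
}
```

For example, with mask `<1,0,1,1>` the three enabled lanes read the first three consecutive memory elements, which is exactly what `vexpandps`/`vpexpandq` do under `{%k1}` in the CHECK lines above.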
+<br>
+define void @test6(float* %base, <16 x float> %V) {<br>
+; ALL-LABEL: test6:<br>
+; ALL: # BB#0:<br>
+; ALL-NEXT: movw $-2049, %ax # imm = 0xF7FF<br>
+; ALL-NEXT: kmovw %eax, %k1<br>
+; ALL-NEXT: vcompressps %zmm0, (%rdi) {%k1}<br>
+; ALL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v16f32(<16 x float> %V, float* %base, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 true, i1 true>)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test7(float* %base, <8 x float> %V, <8 x i1> %mask) {<br>
+; SKX-LABEL: test7:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpsllw $15, %xmm1, %xmm1<br>
+; SKX-NEXT: vpmovw2m %xmm1, %k1<br>
+; SKX-NEXT: vcompressps %ymm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test7:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %YMM0<def> %YMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: vpmovsxwq %xmm1, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k0<br>
+; KNL-NEXT: kshiftlw $8, %k0, %k0<br>
+; KNL-NEXT: kshiftrw $8, %k0, %k1<br>
+; KNL-NEXT: vcompressps %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v8f32(<8 x float> %V, float* %base, <8 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test8(double* %base, <8 x double> %V, <8 x i1> %mask) {<br>
+; SKX-LABEL: test8:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpsllw $15, %xmm1, %xmm1<br>
+; SKX-NEXT: vpmovw2m %xmm1, %k1<br>
+; SKX-NEXT: vcompresspd %zmm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test8:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: vpmovsxwq %xmm1, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vcompresspd %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v8f64(<8 x double> %V, double* %base, <8 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test9(i64* %base, <8 x i64> %V, <8 x i1> %mask) {<br>
+; SKX-LABEL: test9:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpsllw $15, %xmm1, %xmm1<br>
+; SKX-NEXT: vpmovw2m %xmm1, %k1<br>
+; SKX-NEXT: vpcompressq %zmm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test9:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: vpmovsxwq %xmm1, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vpcompressq %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v8i64(<8 x i64> %V, i64* %base, <8 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test10(i64* %base, <4 x i64> %V, <4 x i1> %mask) {<br>
+; SKX-LABEL: test10:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpslld $31, %xmm1, %xmm1<br>
+; SKX-NEXT: vptestmd %xmm1, %xmm1, %k1<br>
+; SKX-NEXT: vpcompressq %ymm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test10:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %YMM0<def> %YMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: vpslld $31, %xmm1, %xmm1<br>
+; KNL-NEXT: vpsrad $31, %xmm1, %xmm1<br>
+; KNL-NEXT: vpmovsxdq %xmm1, %ymm1<br>
+; KNL-NEXT: vpxord %zmm2, %zmm2, %zmm2<br>
+; KNL-NEXT: vinserti64x4 $0, %ymm1, %zmm2, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vpcompressq %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v4i64(<4 x i64> %V, i64* %base, <4 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test11(i64* %base, <2 x i64> %V, <2 x i1> %mask) {<br>
+; SKX-LABEL: test11:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpsllq $63, %xmm1, %xmm1<br>
+; SKX-NEXT: vptestmq %xmm1, %xmm1, %k1<br>
+; SKX-NEXT: vpcompressq %xmm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test11:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: vpsllq $63, %xmm1, %xmm1<br>
+; KNL-NEXT: vpsrad $31, %xmm1, %xmm1<br>
+; KNL-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]<br>
+; KNL-NEXT: vpxord %zmm2, %zmm2, %zmm2<br>
+; KNL-NEXT: vinserti32x4 $0, %xmm1, %zmm2, %zmm1<br>
+; KNL-NEXT: vpsllq $63, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmq %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vpcompressq %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v2i64(<2 x i64> %V, i64* %base, <2 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+define void @test12(float* %base, <4 x float> %V, <4 x i1> %mask) {<br>
+; SKX-LABEL: test12:<br>
+; SKX: # BB#0:<br>
+; SKX-NEXT: vpslld $31, %xmm1, %xmm1<br>
+; SKX-NEXT: vptestmd %xmm1, %xmm1, %k1<br>
+; SKX-NEXT: vcompressps %xmm0, (%rdi) {%k1}<br>
+; SKX-NEXT: retq<br>
+;<br>
+; KNL-LABEL: test12:<br>
+; KNL: # BB#0:<br>
+; KNL-NEXT: # kill: %XMM0<def> %XMM0<kill> %ZMM0<def><br>
+; KNL-NEXT: vpslld $31, %xmm1, %xmm1<br>
+; KNL-NEXT: vpsrad $31, %xmm1, %xmm1<br>
+; KNL-NEXT: vpxord %zmm2, %zmm2, %zmm2<br>
+; KNL-NEXT: vinserti32x4 $0, %xmm1, %zmm2, %zmm1<br>
+; KNL-NEXT: vpslld $31, %zmm1, %zmm1<br>
+; KNL-NEXT: vptestmd %zmm1, %zmm1, %k1<br>
+; KNL-NEXT: vcompressps %zmm0, (%rdi) {%k1}<br>
+; KNL-NEXT: retq<br>
+ call void @llvm.masked.compressstore.v4f32(<4 x float> %V, float* %base, <4 x i1> %mask)<br>
+ ret void<br>
+}<br>
+<br>
+declare void @llvm.masked.compressstore.v16f32(<16 x float>, float* , <16 x i1>)<br>
+declare void @llvm.masked.compressstore.v8f32(<8 x float>, float* , <8 x i1>)<br>
+declare void @llvm.masked.compressstore.v8f64(<8 x double>, double* , <8 x i1>)<br>
+declare void @llvm.masked.compressstore.v16i32(<16 x i32>, i32* , <16 x i1>)<br>
+declare void @llvm.masked.compressstore.v8i32(<8 x i32>, i32* , <8 x i1>)<br>
+declare void @llvm.masked.compressstore.v8i64(<8 x i64>, i64* , <8 x i1>)<br>
+declare void @llvm.masked.compressstore.v4i32(<4 x i32>, i32* , <4 x i1>)<br>
+declare void @llvm.masked.compressstore.v4f32(<4 x float>, float* , <4 x i1>)<br>
+declare void @llvm.masked.compressstore.v4i64(<4 x i64>, i64* , <4 x i1>)<br>
+declare void @llvm.masked.compressstore.v2i64(<2 x i64>, i64* , <2 x i1>)<br>
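The compressstore tests mirror the inverse operation: enabled vector lanes are packed contiguously into memory starting at the base pointer, and disabled lanes write nothing. A scalar reference sketch (again illustrative, not the lowering itself):

```cpp
#include <cstddef>
#include <vector>

// Scalar semantics of llvm.masked.compressstore: active lanes are written
// back-to-back; the memory index advances only on enabled lanes.
template <typename T>
std::size_t compressStore(const std::vector<T> &V,
                          const std::vector<bool> &Mask, T *Base) {
  std::size_t MemIdx = 0;
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      Base[MemIdx++] = V[Lane];
  return MemIdx; // number of elements actually stored
}
```

This is the operation `vcompressps`/`vpcompressq` perform with a memory destination under `{%k1}` in the CHECK lines above.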
<br>
Modified: llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/utils/TableGen/CodeGenTarget.cpp (original)<br>
+++ llvm/trunk/utils/TableGen/CodeGenTarget.cpp Wed Nov 2 22:23:55 2016<br>
@@ -550,8 +550,7 @@ CodeGenIntrinsic::CodeGenIntrinsic(Recor<br>
// overloaded, all the types can be specified directly.<br>
assert(((!TyEl->isSubClassOf("LLVMExtendedType") &&<br>
!TyEl->isSubClassOf("LLVMTruncatedType") &&<br>
- !TyEl->isSubClassOf("LLVMVectorSameWidth") &&<br>
- !TyEl->isSubClassOf("LLVMPointerToElt")) ||<br>
+ !TyEl->isSubClassOf("LLVMVectorSameWidth")) ||<br>
VT == MVT::iAny || VT == MVT::vAny) &&<br>
"Expected iAny or vAny type");<br>
} else<br>
<br>
Modified: llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=285876&r1=285875&r2=285876&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp (original)<br>
+++ llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp Wed Nov 2 22:23:55 2016<br>
@@ -213,10 +213,11 @@ enum IIT_Info {<br>
IIT_HALF_VEC_ARG = 30,<br>
IIT_SAME_VEC_WIDTH_ARG = 31,<br>
IIT_PTR_TO_ARG = 32,<br>
- IIT_VEC_OF_PTRS_TO_ELT = 33,<br>
- IIT_I128 = 34,<br>
- IIT_V512 = 35,<br>
- IIT_V1024 = 36<br>
+ IIT_PTR_TO_ELT = 33,<br>
+ IIT_VEC_OF_PTRS_TO_ELT = 34,<br>
+ IIT_I128 = 35,<br>
+ IIT_V512 = 36,<br>
+ IIT_V1024 = 37<br>
};<br>
<br>
<br>
@@ -277,6 +278,8 @@ static void EncodeFixedType(Record *R, s<br>
Sig.push_back(IIT_PTR_TO_ARG);<br>
else if (R->isSubClassOf("LLVMVectorOfPointersToElt"))<br>
Sig.push_back(IIT_VEC_OF_PTRS_TO_ELT);<br>
+ else if (R->isSubClassOf("LLVMPointerToElt"))<br>
+ Sig.push_back(IIT_PTR_TO_ELT);<br>
else<br>
Sig.push_back(IIT_ARG);<br>
return Sig.push_back((Number << 3) | ArgCodes[Number]);<br>
<br>
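The `EncodeFixedType` hunk above ends by appending `(Number << 3) | ArgCodes[Number]`, i.e. the intrinsic signature byte packs the referenced argument number into the high bits and a 3-bit argument-kind code into the low bits. A tiny sketch of that packing (a reading of the hunk, not a copy of the real emitter):

```cpp
#include <cassert>

// Pack an argument reference for the IIT signature table:
// high bits = argument number, low 3 bits = argument kind code.
unsigned encodeArgRef(unsigned Number, unsigned ArgCode) {
  assert(ArgCode < 8 && "argument kind code must fit in 3 bits");
  return (Number << 3) | ArgCode;
}
```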
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><o:p></o:p></p>
</blockquote>
</div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
</div>
</div>
</body>
</html>