<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>I would argue that, if anything, the semantics in LangRef should
      precede the codegen support.  Having the documentation include an
      "implementation in progress" disclaimer is fine; having an
      implementation that isn't documented is less so.</p>
    <p>Philip<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 11/02/2016 10:41 PM, Demikhovsky,
      Elena via llvm-commits wrote:<br>
    </div>
    <blockquote
cite="mid:A0DC88CEB3010344830D52D66533DA8E40659D5F@hasmsx108.ger.corp.intel.com"
      type="cite">
      <div class="WordSection1">
        <p class="MsoNormal"><a moz-do-not-send="true"
            name="_MailEndCompose">> What semantics do these
            intrinsics have? Should they be added to the LangRef?</a><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Definitely,
            I’m going to do this in one of the next commits. I just have
            to complete the codegen part first; otherwise, intrinsics that
            appear in LangRef will not be fully supported. <o:p></o:p></span></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"
          style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0
          level1 lfo1">
          <!--[if !supportLists]--><span
            style="font-family:"Calibri",sans-serif;color:#2F5496"><span
              style="mso-list:Ignore">-<span style="font:7.0pt
                "Times New Roman"">         
              </span></span></span><!--[endif]--><span dir="LTR"></span><b><i><span
                style="color:#2F5496"> Elena<o:p></o:p></span></i></b></p>
        <p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
        <p class="MsoNormal"><a moz-do-not-send="true"
            name="_____replyseparator"></a><b><span
              style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">
            David Majnemer [<a class="moz-txt-link-freetext" href="mailto:david.majnemer@gmail.com">mailto:david.majnemer@gmail.com</a>]
            <br>
            <b>Sent:</b> Thursday, November 03, 2016 05:55<br>
            <b>To:</b> Demikhovsky, Elena
            <a class="moz-txt-link-rfc2396E" href="mailto:elena.demikhovsky@intel.com"><elena.demikhovsky@intel.com></a><br>
            <b>Cc:</b> llvm-commits <a class="moz-txt-link-rfc2396E" href="mailto:llvm-commits@lists.llvm.org"><llvm-commits@lists.llvm.org></a><br>
            <b>Subject:</b> Re: [llvm] r285876 - Expandload and
            Compressstore intrinsics<o:p></o:p></span></p>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <p class="MsoNormal"
            style="margin-left:46.2pt;text-indent:-17.85pt">What
            semantics do these intrinsics have? Should they be added to
            the LangRef?<o:p></o:p></p>
        </div>
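        <div>
          <p class="MsoNormal">For readers following the semantics question,
            here is a minimal scalar sketch of what the two intrinsics are
            understood to do. It assumes they mirror the AVX-512
            compress/expand behaviour named in the commit log quoted below,
            and it takes the operand order from the comments in the
            SelectionDAGBuilder part of the diff. The function names
            expandLoadModel and compressStoreModel are hypothetical and are
            not part of LLVM; the returned element count is a convenience of
            the sketch, since the real compressstore intrinsic returns
            nothing.</p>
        </div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of llvm.masked.expandload(Ptr, Mask, Src0).
// Assumption: consecutive elements are read from memory and placed into the
// lanes whose mask bit is set; the other lanes keep the pass-through Src0.
template <typename T>
std::vector<T> expandLoadModel(const T *Ptr, const std::vector<bool> &Mask,
                               const std::vector<T> &Src0) {
  assert(Mask.size() == Src0.size() && "mask and pass-through widths differ");
  std::vector<T> Result(Src0);
  std::size_t Next = 0; // index of the next unread memory element
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      Result[Lane] = Ptr[Next++];
  return Result;
}

// Hypothetical scalar model of llvm.masked.compressstore(Src0, Ptr, Mask).
// Assumption: the lanes whose mask bit is set are written to consecutive
// memory locations starting at Ptr. Returns the number of elements written
// (a convenience of this sketch only).
template <typename T>
std::size_t compressStoreModel(const std::vector<T> &Src0, T *Ptr,
                               const std::vector<bool> &Mask) {
  assert(Mask.size() == Src0.size() && "mask and data widths differ");
  std::size_t Next = 0; // index of the next memory slot to write
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      Ptr[Next++] = Src0[Lane];
  return Next;
}
```

        <div>
          <p class="MsoNormal">Under these assumptions, memory is touched
            only for as many elements as there are set mask bits, which is
            the difference from the plain masked load and store, where each
            lane has a fixed address.</p>
        </div>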
        <div>
          <p class="MsoNormal"
            style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal"
              style="margin-left:46.2pt;text-indent:-17.85pt">On Wed,
              Nov 2, 2016 at 8:23 PM, Elena Demikhovsky via llvm-commits
              <<a moz-do-not-send="true"
                href="mailto:llvm-commits@lists.llvm.org"
                target="_blank">llvm-commits@lists.llvm.org</a>>
              wrote:<o:p></o:p></p>
            <blockquote style="border:none;border-left:solid #CCCCCC
              1.0pt;padding:0cm 0cm 0cm
              6.0pt;margin-left:4.8pt;margin-right:0cm">
              <p class="MsoNormal"
                style="margin-left:46.2pt;text-indent:-17.85pt">Author:
                delena<br>
                Date: Wed Nov  2 22:23:55 2016<br>
                New Revision: 285876<br>
                <br>
                URL: <a moz-do-not-send="true"
                  href="http://llvm.org/viewvc/llvm-project?rev=285876&view=rev"
                  target="_blank">
http://llvm.org/viewvc/llvm-project?rev=285876&view=rev</a><br>
                Log:<br>
                Expandload and Compressstore intrinsics<br>
                <br>
                2 new intrinsics covering AVX-512 compress/expand
                functionality.<br>
                This implementation includes syntax, DAG builder,
                operation lowering and tests.<br>
                Does not include: handling of illegal data types,
                codegen prepare pass and the cost model.<br>
                <br>
                <br>
                Added:<br>
                    llvm/trunk/test/CodeGen/X86/compress_expand.ll<br>
                Modified:<br>
                    llvm/trunk/include/llvm/IR/Intrinsics.h<br>
                    llvm/trunk/include/llvm/IR/Intrinsics.td<br>
                    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
                   
                llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp<br>
                   
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp<br>
                   
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h<br>
                    llvm/trunk/lib/IR/Function.cpp<br>
                    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
                    llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td<br>
                    llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>
                    llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>
                <br>
                Modified: llvm/trunk/include/llvm/IR/Intrinsics.h<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.h?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.h?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/include/llvm/IR/Intrinsics.h (original)<br>
                +++ llvm/trunk/include/llvm/IR/Intrinsics.h Wed Nov  2
                22:23:55 2016<br>
                @@ -100,7 +100,7 @@ namespace Intrinsic {<br>
                       Void, VarArg, MMX, Token, Metadata, Half, Float,
                Double,<br>
                       Integer, Vector, Pointer, Struct,<br>
                       Argument, ExtendArgument, TruncArgument,
                HalfVecArgument,<br>
                -      SameVecWidthArgument, PtrToArgument,
                VecOfPtrsToElt<br>
                +      SameVecWidthArgument, PtrToArgument, PtrToElt,
                VecOfPtrsToElt<br>
                     } Kind;<br>
                <br>
                     union {<br>
                @@ -123,7 +123,7 @@ namespace Intrinsic {<br>
                       assert(Kind == Argument || Kind == ExtendArgument
                ||<br>
                              Kind == TruncArgument || Kind ==
                HalfVecArgument ||<br>
                              Kind == SameVecWidthArgument || Kind ==
                PtrToArgument ||<br>
                -             Kind == VecOfPtrsToElt);<br>
                +             Kind == PtrToElt || Kind ==
                VecOfPtrsToElt);<br>
                       return Argument_Info >> 3;<br>
                     }<br>
                     ArgKind getArgumentKind() const {<br>
                <br>
                Modified: llvm/trunk/include/llvm/IR/Intrinsics.td<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/include/llvm/IR/Intrinsics.td (original)<br>
                +++ llvm/trunk/include/llvm/IR/Intrinsics.td Wed Nov  2
                22:23:55 2016<br>
                @@ -133,6 +133,7 @@ class LLVMVectorSameWidth<int
                num, LLVMT<br>
                   ValueType ElTy = elty.VT;<br>
                 }<br>
                 class LLVMPointerTo<int num> :
                LLVMMatchType<num>;<br>
                +class LLVMPointerToElt<int num> :
                LLVMMatchType<num>;<br>
                 class LLVMVectorOfPointersToElt<int num> :
                LLVMMatchType<num>;<br>
                <br>
                 // Match the type of another intrinsic parameter that
                is expected to be a<br>
                @@ -718,13 +719,25 @@ def int_masked_gather:
                Intrinsic<[llvm_a<br>
                                                 
                [LLVMVectorOfPointersToElt<0>, llvm_i32_ty,<br>
                                                 
                 LLVMVectorSameWidth<0, llvm_i1_ty>,<br>
                                                 
                 LLVMMatchType<0>],<br>
                -                                 [IntrReadMem]>;<br>
                +                                 [IntrReadMem]>;<br>
                <br>
                 def int_masked_scatter: Intrinsic<[],<br>
                                                   [llvm_anyvector_ty,<br>
                                                   
                LLVMVectorOfPointersToElt<0>, llvm_i32_ty,<br>
                                                   
                LLVMVectorSameWidth<0, llvm_i1_ty>]>;<br>
                <br>
                +def int_masked_expandload:
                Intrinsic<[llvm_anyvector_ty],<br>
                +                                   
                 [LLVMPointerToElt<0>,<br>
                +                                     
                LLVMVectorSameWidth<0, llvm_i1_ty>,<br>
                +                                     
                LLVMMatchType<0>],<br>
                +                                     [IntrReadMem]>;<br>
                +<br>
                +def int_masked_compressstore: Intrinsic<[],<br>
                +                                   
                 [llvm_anyvector_ty,<br>
                +                                     
                LLVMPointerToElt<0>,<br>
                +                                     
                LLVMVectorSameWidth<0, llvm_i1_ty>],<br>
                +                                   
                 [IntrArgMemOnly]>;<br>
                +<br>
                 // Test whether a pointer is associated with a type
                metadata identifier.<br>
                 def int_type_test : Intrinsic<[llvm_i1_ty],
                [llvm_ptr_ty, llvm_metadata_ty],<br>
                                               [IntrNoMem]>;<br>
                <br>
                Modified:
                llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
                (original)<br>
                +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
                Wed Nov  2 22:23:55 2016<br>
                @@ -5583,7 +5583,7 @@ SDValue
                DAGCombiner::visitMSTORE(SDNode<br>
                                            Alignment,
                MST->getAAInfo(), MST->getRanges());<br>
                <br>
                     Lo = DAG.getMaskedStore(Chain, DL, DataLo, Ptr,
                MaskLo, LoMemVT, MMO,<br>
                -                           
                MST->isTruncatingStore());<br>
                +                           
                MST->isTruncatingStore(),
                MST->isCompressingStore());<br>
                <br>
                     unsigned IncrementSize = LoMemVT.getSizeInBits()/8;<br>
                     Ptr = DAG.getNode(ISD::ADD, DL, Ptr.getValueType(),
                Ptr,<br>
                @@ -5596,7 +5596,7 @@ SDValue
                DAGCombiner::visitMSTORE(SDNode<br>
                                            MST->getRanges());<br>
                <br>
                     Hi = DAG.getMaskedStore(Chain, DL, DataHi, Ptr,
                MaskHi, HiMemVT, MMO,<br>
                -                           
                MST->isTruncatingStore());<br>
                +                           
                MST->isTruncatingStore(),
                MST->isCompressingStore());<br>
                <br>
                     AddToWorklist(Lo.getNode());<br>
                     AddToWorklist(Hi.getNode());<br>
                <br>
                Modified:
                llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                ---
                llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
                (original)<br>
                +++
                llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
                Wed Nov  2 22:23:55 2016<br>
                @@ -1212,7 +1212,7 @@ SDValue
                DAGTypeLegalizer::PromoteIntOp_M<br>
                <br>
                   return DAG.getMaskedStore(N->getChain(), dl,
                DataOp, N->getBasePtr(), Mask,<br>
                                             N->getMemoryVT(),
                N->getMemOperand(),<br>
                -                            TruncateStore);<br>
                +                            TruncateStore,
                N->isCompressingStore());<br>
                 }<br>
                <br>
                 SDValue
                DAGTypeLegalizer::PromoteIntOp_MLOAD(MaskedLoadSDNode
                *N,<br>
                <br>
                Modified:
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                ---
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
                (original)<br>
                +++
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
                Wed Nov  2 22:23:55 2016<br>
                @@ -3667,16 +3667,39 @@ void
                SelectionDAGBuilder::visitStore(con<br>
                   DAG.setRoot(StoreNode);<br>
                 }<br>
                <br>
                -void SelectionDAGBuilder::visitMaskedStore(const
                CallInst &I) {<br>
                +void SelectionDAGBuilder::visitMaskedStore(const
                CallInst &I,<br>
                +                                           bool
                IsCompressing) {<br>
                   SDLoc sdl = getCurSDLoc();<br>
                <br>
                -  // llvm.masked.store.*(Src0, Ptr, alignment, Mask)<br>
                -  Value  *PtrOperand = I.getArgOperand(1);<br>
                +  auto getMaskedStoreOps = [&](Value* &Ptr,
                Value* &Mask, Value* &Src0,<br>
                +                           unsigned& Alignment) {<br>
                +    // llvm.masked.store.*(Src0, Ptr, alignment, Mask)<br>
                +    Src0 = I.getArgOperand(0);<br>
                +    Ptr = I.getArgOperand(1);<br>
                +    Alignment =
                cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();<br>
                +    Mask = I.getArgOperand(3);<br>
                +  };<br>
                +  auto getCompressingStoreOps = [&](Value*
                &Ptr, Value* &Mask, Value* &Src0,<br>
                +                           unsigned& Alignment) {<br>
                +    // llvm.masked.compressstore.*(Src0, Ptr, Mask)<br>
                +    Src0 = I.getArgOperand(0);<br>
                +    Ptr = I.getArgOperand(1);<br>
                +    Mask = I.getArgOperand(2);<br>
                +    Alignment = 0;<br>
                +  };<br>
                +<br>
                +  Value  *PtrOperand, *MaskOperand, *Src0Operand;<br>
                +  unsigned Alignment;<br>
                +  if (IsCompressing)<br>
                +    getCompressingStoreOps(PtrOperand, MaskOperand,
                Src0Operand, Alignment);<br>
                +  else<br>
                +    getMaskedStoreOps(PtrOperand, MaskOperand,
                Src0Operand, Alignment);<br>
                +<br>
                   SDValue Ptr = getValue(PtrOperand);<br>
                -  SDValue Src0 = getValue(I.getArgOperand(0));<br>
                -  SDValue Mask = getValue(I.getArgOperand(3));<br>
                +  SDValue Src0 = getValue(Src0Operand);<br>
                +  SDValue Mask = getValue(MaskOperand);<br>
                +<br>
                   EVT VT = Src0.getValueType();<br>
                -  unsigned Alignment =
                (cast<ConstantInt>(I.getArgOperand(2)))->getZExtValue();<br>
                   if (!Alignment)<br>
                     Alignment = DAG.getEVTAlignment(VT);<br>
                <br>
                @@ -3689,7 +3712,8 @@ void
                SelectionDAGBuilder::visitMaskedSto<br>
                                           MachineMemOperand::MOStore, 
                VT.getStoreSize(),<br>
                                           Alignment, AAInfo);<br>
                   SDValue StoreNode = DAG.getMaskedStore(getRoot(),
                sdl, Src0, Ptr, Mask, VT,<br>
                -                                         MMO, false);<br>
                +                                         MMO, false /*
                Truncating */,<br>
                +                                       
                 IsCompressing);<br>
                   DAG.setRoot(StoreNode);<br>
                   setValue(&I, StoreNode);<br>
                 }<br>
                @@ -3710,7 +3734,7 @@ void
                SelectionDAGBuilder::visitMaskedSto<br>
                 // extract the spalt value and use it as a uniform
                base.<br>
                 // In all other cases the function returns 'false'.<br>
                 //<br>
                -static bool getUniformBase(const Value *& Ptr,
                SDValue& Base, SDValue& Index,<br>
                +static bool getUniformBase(const Value* &Ptr,
                SDValue& Base, SDValue& Index,<br>
                                            SelectionDAGBuilder* SDB) {<br>
                <br>
                   SelectionDAG& DAG = SDB->DAG;<br>
                @@ -3790,18 +3814,38 @@ void
                SelectionDAGBuilder::visitMaskedSca<br>
                   setValue(&I, Scatter);<br>
                 }<br>
                <br>
                -void SelectionDAGBuilder::visitMaskedLoad(const
                CallInst &I) {<br>
                +void SelectionDAGBuilder::visitMaskedLoad(const
                CallInst &I, bool IsExpanding) {<br>
                   SDLoc sdl = getCurSDLoc();<br>
                <br>
                -  // @llvm.masked.load.*(Ptr, alignment, Mask, Src0)<br>
                -  Value  *PtrOperand = I.getArgOperand(0);<br>
                +  auto getMaskedLoadOps = [&](Value* &Ptr,
                Value* &Mask, Value* &Src0,<br>
                +                           unsigned& Alignment) {<br>
                +    // @llvm.masked.load.*(Ptr, alignment, Mask, Src0)<br>
                +    Ptr = I.getArgOperand(0);<br>
                +    Alignment =
                cast<ConstantInt>(I.getArgOperand(1))->getZExtValue();<br>
                +    Mask = I.getArgOperand(2);<br>
                +    Src0 = I.getArgOperand(3);<br>
                +  };<br>
                +  auto getExpandingLoadOps = [&](Value* &Ptr,
                Value* &Mask, Value* &Src0,<br>
                +                           unsigned& Alignment) {<br>
                +    // @llvm.masked.expandload.*(Ptr, Mask, Src0)<br>
                +    Ptr = I.getArgOperand(0);<br>
                +    Alignment = 0;<br>
                +    Mask = I.getArgOperand(1);<br>
                +    Src0 = I.getArgOperand(2);<br>
                +  };<br>
                +<br>
                +  Value  *PtrOperand, *MaskOperand, *Src0Operand;<br>
                +  unsigned Alignment;<br>
                +  if (IsExpanding)<br>
                +    getExpandingLoadOps(PtrOperand, MaskOperand,
                Src0Operand, Alignment);<br>
                +  else<br>
                +    getMaskedLoadOps(PtrOperand, MaskOperand,
                Src0Operand, Alignment);<br>
                +<br>
                   SDValue Ptr = getValue(PtrOperand);<br>
                -  SDValue Src0 = getValue(I.getArgOperand(3));<br>
                -  SDValue Mask = getValue(I.getArgOperand(2));<br>
                +  SDValue Src0 = getValue(Src0Operand);<br>
                +  SDValue Mask = getValue(MaskOperand);<br>
                <br>
                -  const TargetLowering &TLI =
                DAG.getTargetLoweringInfo();<br>
                -  EVT VT = TLI.getValueType(DAG.getDataLayout(),
                I.getType());<br>
                -  unsigned Alignment =
                (cast<ConstantInt>(I.getArgOperand(1)))->getZExtValue();<br>
                +  EVT VT = Src0.getValueType();<br>
                   if (!Alignment)<br>
                     Alignment = DAG.getEVTAlignment(VT);<br>
                <br>
                @@ -3821,7 +3865,7 @@ void
                SelectionDAGBuilder::visitMaskedLoa<br>
                                           Alignment, AAInfo, Ranges);<br>
                <br>
                   SDValue Load = DAG.getMaskedLoad(VT, sdl, InChain,
                Ptr, Mask, Src0, VT, MMO,<br>
                -                                   ISD::NON_EXTLOAD,
                false);<br>
                +                                   ISD::NON_EXTLOAD,
                IsExpanding);<br>
                   if (AddToChain) {<br>
                     SDValue OutChain = Load.getValue(1);<br>
                     DAG.setRoot(OutChain);<br>
                @@ -5054,6 +5098,12 @@
                SelectionDAGBuilder::visitIntrinsicCall(<br>
                   case Intrinsic::masked_store:<br>
                     visitMaskedStore(I);<br>
                     return nullptr;<br>
                +  case Intrinsic::masked_expandload:<br>
                +    visitMaskedLoad(I, true /* IsExpanding */);<br>
                +    return nullptr;<br>
                +  case Intrinsic::masked_compressstore:<br>
                +    visitMaskedStore(I, true /* IsCompressing */);<br>
                +    return nullptr;<br>
                   case Intrinsic::x86_mmx_pslli_w:<br>
                   case Intrinsic::x86_mmx_pslli_d:<br>
                   case Intrinsic::x86_mmx_pslli_q:<br>
                <br>
                Modified:
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                ---
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
                (original)<br>
                +++
                llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
                Wed Nov  2 22:23:55 2016<br>
                @@ -874,8 +874,8 @@ private:<br>
                   void visitAlloca(const AllocaInst &I);<br>
                   void visitLoad(const LoadInst &I);<br>
                   void visitStore(const StoreInst &I);<br>
                -  void visitMaskedLoad(const CallInst &I);<br>
                -  void visitMaskedStore(const CallInst &I);<br>
                +  void visitMaskedLoad(const CallInst &I, bool
                IsExpanding = false);<br>
                +  void visitMaskedStore(const CallInst &I, bool
                IsCompressing = false);<br>
                   void visitMaskedGather(const CallInst &I);<br>
                   void visitMaskedScatter(const CallInst &I);<br>
                   void visitAtomicCmpXchg(const AtomicCmpXchgInst
                &I);<br>
                <br>
                Modified: llvm/trunk/lib/IR/Function.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/IR/Function.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/lib/IR/Function.cpp (original)<br>
                +++ llvm/trunk/lib/IR/Function.cpp Wed Nov  2 22:23:55
                2016<br>
                @@ -607,10 +607,11 @@ enum IIT_Info {<br>
                   IIT_HALF_VEC_ARG = 30,<br>
                   IIT_SAME_VEC_WIDTH_ARG = 31,<br>
                   IIT_PTR_TO_ARG = 32,<br>
                -  IIT_VEC_OF_PTRS_TO_ELT = 33,<br>
                -  IIT_I128 = 34,<br>
                -  IIT_V512 = 35,<br>
                -  IIT_V1024 = 36<br>
                +  IIT_PTR_TO_ELT = 33,<br>
                +  IIT_VEC_OF_PTRS_TO_ELT = 34,<br>
                +  IIT_I128 = 35,<br>
                +  IIT_V512 = 36,<br>
                +  IIT_V1024 = 37<br>
                 };<br>
                <br>
                <br>
                @@ -744,6 +745,11 @@ static void DecodeIITType(unsigned
                &Next<br>
                                                              ArgInfo));<br>
                     return;<br>
                   }<br>
                +  case IIT_PTR_TO_ELT: {<br>
                +    unsigned ArgInfo = (NextElt == Infos.size() ? 0 :
                Infos[NextElt++]);<br>
                +   
                OutputTable.push_back(IITDescriptor::get(IITDescriptor::PtrToElt,
                ArgInfo));<br>
                +    return;<br>
                +  }<br>
                   case IIT_VEC_OF_PTRS_TO_ELT: {<br>
                     unsigned ArgInfo = (NextElt == Infos.size() ? 0 :
                Infos[NextElt++]);<br>
                   
                 OutputTable.push_back(IITDescriptor::get(IITDescriptor::VecOfPtrsToElt,<br>
                @@ -870,6 +876,14 @@ static Type
                *DecodeFixedType(ArrayRef<In<br>
                     Type *Ty = Tys[D.getArgumentNumber()];<br>
                     return PointerType::getUnqual(Ty);<br>
                   }<br>
                +  case IITDescriptor::PtrToElt: {<br>
                +    Type *Ty = Tys[D.getArgumentNumber()];<br>
                +    VectorType *VTy = dyn_cast<VectorType>(Ty);<br>
                +    if (!VTy)<br>
                +      llvm_unreachable("Expected an argument of Vector
                Type");<br>
                +    Type *EltTy = VTy->getVectorElementType();<br>
                +    return PointerType::getUnqual(EltTy);<br>
                +  }<br>
                   case IITDescriptor::VecOfPtrsToElt: {<br>
                     Type *Ty = Tys[D.getArgumentNumber()];<br>
                     VectorType *VTy = dyn_cast<VectorType>(Ty);<br>
                @@ -1048,7 +1062,7 @@ bool
                Intrinsic::matchIntrinsicType(Type<br>
                       if (D.getArgumentNumber() >= ArgTys.size())<br>
                         return true;<br>
                       VectorType * ReferenceType =<br>
                -             
                dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>
                +       
                dyn_cast<VectorType>(ArgTys[D.getArgumentNumber()]);<br>
                       VectorType *ThisArgType =
                dyn_cast<VectorType>(Ty);<br>
                       if (!ThisArgType || !ReferenceType ||<br>
                           (ReferenceType->getVectorNumElements() !=<br>
                @@ -1064,6 +1078,16 @@ bool
                Intrinsic::matchIntrinsicType(Type<br>
                       PointerType *ThisArgType =
                dyn_cast<PointerType>(Ty);<br>
                       return (!ThisArgType ||
                ThisArgType->getElementType() != ReferenceType);<br>
                     }<br>
                +    case IITDescriptor::PtrToElt: {<br>
                +      if (D.getArgumentNumber() >= ArgTys.size())<br>
                +        return true;<br>
                +      VectorType * ReferenceType =<br>
                +        dyn_cast<VectorType>
                (ArgTys[D.getArgumentNumber()]);<br>
                +      PointerType *ThisArgType =
                dyn_cast<PointerType>(Ty);<br>
                +<br>
                +      return (!ThisArgType || !ReferenceType ||<br>
                +              ThisArgType->getElementType() !=
                ReferenceType->getElementType());<br>
                +    }<br>
                     case IITDescriptor::VecOfPtrsToElt: {<br>
                       if (D.getArgumentNumber() >= ArgTys.size())<br>
                         return true;<br>
                <br>
                Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
                (original)<br>
                +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed
                Nov  2 22:23:55 2016<br>
                @@ -1232,10 +1232,11 @@
                X86TargetLowering::X86TargetLowering(con<br>
                       setTruncStoreAction(MVT::v4i32, MVT::v4i8, 
                Legal);<br>
                       setTruncStoreAction(MVT::v4i32, MVT::v4i16,
                Legal);<br>
                     } else {<br>
                -      setOperationAction(ISD::MLOAD,    MVT::v8i32,
                Custom);<br>
                -      setOperationAction(ISD::MLOAD,    MVT::v8f32,
                Custom);<br>
                -      setOperationAction(ISD::MSTORE,   MVT::v8i32,
                Custom);<br>
                -      setOperationAction(ISD::MSTORE,   MVT::v8f32,
                Custom);<br>
                +      for (auto VT : {MVT::v4i32, MVT::v8i32,
                MVT::v2i64, MVT::v4i64,<br>
                +           MVT::v4f32, MVT::v8f32, MVT::v2f64,
                MVT::v4f64}) {<br>
                +        setOperationAction(ISD::MLOAD,  VT, Custom);<br>
                +        setOperationAction(ISD::MSTORE, VT, Custom);<br>
                +      }<br>
                     }<br>
                     setOperationAction(ISD::TRUNCATE,         
                 MVT::i1, Custom);<br>
                     setOperationAction(ISD::TRUNCATE,         
                 MVT::v16i8, Custom);<br>
                @@ -21940,26 +21941,48 @@ static SDValue
                LowerMLOAD(SDValue Op, co<br>
                   SDValue Mask = N->getMask();<br>
                   SDLoc dl(Op);<br>
                <br>
                +  assert((!N->isExpandingLoad() ||
                Subtarget.hasAVX512()) &&<br>
                +         "Expanding masked load is supported on AVX-512
                target only!");<br>
                +<br>
                +  assert((!N->isExpandingLoad() ||
                ScalarVT.getSizeInBits() >= 32) &&<br>
                +         "Expanding masked load is supported for 32 and
                64-bit types only!");<br>
                +<br>
                +  // 4x32, 4x64 and 2x64 vectors of non-expanding loads
                are legal regardless of<br>
                +  // VLX. These types for exp-loads are handled here.<br>
                +  if (!N->isExpandingLoad() &&
                VT.getVectorNumElements() <= 4)<br>
                +    return Op;<br>
                +<br>
                   assert(Subtarget.hasAVX512() &&
                !Subtarget.hasVLX() && !VT.is512BitVector()
                &&<br>
                          "Cannot lower masked load op.");<br>
                <br>
                -  assert(((ScalarVT == MVT::i32 || ScalarVT ==
                MVT::f32) ||<br>
                +  assert((ScalarVT.getSizeInBits() >= 32 ||<br>
                           (Subtarget.hasBWI() &&<br>
                               (ScalarVT == MVT::i8 || ScalarVT ==
                MVT::i16))) &&<br>
                          "Unsupported masked load op.");<br>
                <br>
                   // This operation is legal for targets with VLX, but
                without<br>
                   // VLX the vector should be widened to 512 bit<br>
                -  unsigned NumEltsInWideVec =
                512/VT.getScalarSizeInBits();<br>
                +  unsigned NumEltsInWideVec = 512 /
                VT.getScalarSizeInBits();<br>
                   MVT WideDataVT = MVT::getVectorVT(ScalarVT,
                NumEltsInWideVec);<br>
                -  MVT WideMaskVT = MVT::getVectorVT(MVT::i1,
                NumEltsInWideVec);<br>
                   SDValue Src0 = N->getSrc0();<br>
                   Src0 = ExtendToType(Src0, WideDataVT, DAG);<br>
                +<br>
                +  // Mask element has to be i1<br>
                +  MVT MaskEltTy =
                Mask.getSimpleValueType().getScalarType();<br>
                +  assert((MaskEltTy == MVT::i1 ||
                VT.getVectorNumElements() <= 4) &&<br>
                +         "We handle 4x32, 4x64 and 2x64 vectors only in
                this case");<br>
                +<br>
                +  MVT WideMaskVT = MVT::getVectorVT(MaskEltTy,
                NumEltsInWideVec);<br>
                +<br>
                   Mask = ExtendToType(Mask, WideMaskVT, DAG, true);<br>
                +  if (MaskEltTy != MVT::i1)<br>
                +    Mask = DAG.getNode(ISD::TRUNCATE, dl,<br>
                +                       MVT::getVectorVT(MVT::i1,
                NumEltsInWideVec), Mask);<br>
                   SDValue NewLoad = DAG.getMaskedLoad(WideDataVT, dl,
                N->getChain(),<br>
                                                     
                 N->getBasePtr(), Mask, Src0,<br>
                                                     
                 N->getMemoryVT(), N->getMemOperand(),<br>
                -                                     
                N->getExtensionType());<br>
                +                                     
                N->getExtensionType(),<br>
                +                                     
                N->isExpandingLoad());<br>
                <br>
                   SDValue Exract = DAG.getNode(ISD::EXTRACT_SUBVECTOR,
                dl, VT,<br>
                                                NewLoad.getValue(0),<br>
                @@ -21977,10 +22000,20 @@ static SDValue
                LowerMSTORE(SDValue Op, c<br>
                   SDValue Mask = N->getMask();<br>
                   SDLoc dl(Op);<br>
                <br>
                +  assert((!N->isCompressingStore() ||
                Subtarget.hasAVX512()) &&<br>
                +         "Compressing masked store is supported on AVX-512
                target only!");<br>
                +<br>
                +  assert((!N->isCompressingStore() ||
                ScalarVT.getSizeInBits() >= 32) &&<br>
                +         "Compressing masked store is supported for 32 and
                64-bit types only!");<br>
                +<br>
                +  // 4x32 and 2x64 vectors of non-compressing stores
                are legal regardless of VLX.<br>
                +  if (!N->isCompressingStore() &&
                VT.getVectorNumElements() <= 4)<br>
                +    return Op;<br>
                +<br>
                   assert(Subtarget.hasAVX512() &&
                !Subtarget.hasVLX() && !VT.is512BitVector()
                &&<br>
                          "Cannot lower masked store op.");<br>
                <br>
                -  assert(((ScalarVT == MVT::i32 || ScalarVT ==
                MVT::f32) ||<br>
                +  assert((ScalarVT.getSizeInBits() >= 32 ||<br>
                           (Subtarget.hasBWI() &&<br>
                               (ScalarVT == MVT::i8 || ScalarVT ==
                MVT::i16))) &&<br>
                           "Unsupported masked store op.");<br>
                @@ -21989,12 +22022,22 @@ static SDValue
                LowerMSTORE(SDValue Op, c<br>
                   // VLX the vector should be widened to 512 bit<br>
                   unsigned NumEltsInWideVec =
                512/VT.getScalarSizeInBits();<br>
                   MVT WideDataVT = MVT::getVectorVT(ScalarVT,
                NumEltsInWideVec);<br>
                -  MVT WideMaskVT = MVT::getVectorVT(MVT::i1,
                NumEltsInWideVec);<br>
                +<br>
                +  // Mask element has to be i1<br>
                +  MVT MaskEltTy =
                Mask.getSimpleValueType().getScalarType();<br>
                +  assert((MaskEltTy == MVT::i1 ||
                VT.getVectorNumElements() <= 4) &&<br>
                +         "We handle 4x32, 4x64 and 2x64 vectors only in
                this case");<br>
                +<br>
                +  MVT WideMaskVT = MVT::getVectorVT(MaskEltTy,
                NumEltsInWideVec);<br>
                +<br>
                   DataToStore = ExtendToType(DataToStore, WideDataVT,
                DAG);<br>
                   Mask = ExtendToType(Mask, WideMaskVT, DAG, true);<br>
                +  if (MaskEltTy != MVT::i1)<br>
                +    Mask = DAG.getNode(ISD::TRUNCATE, dl,<br>
                +                       MVT::getVectorVT(MVT::i1,
                NumEltsInWideVec), Mask);<br>
                   return DAG.getMaskedStore(N->getChain(), dl,
                DataToStore, N->getBasePtr(),<br>
                                             Mask, N->getMemoryVT(),
                N->getMemOperand(),<br>
                -                            N->isTruncatingStore());<br>
                +                            N->isTruncatingStore(),
                N->isCompressingStore());<br>
                 }<br>
                <br>
                 static SDValue LowerMGATHER(SDValue Op, const
                X86Subtarget &Subtarget,<br>
                @@ -29881,6 +29924,11 @@ static SDValue
                combineMaskedLoad(SDNode<br>
                                                 
                TargetLowering::DAGCombinerInfo &DCI,<br>
                                                  const X86Subtarget
                &Subtarget) {<br>
                   MaskedLoadSDNode *Mld =
                cast<MaskedLoadSDNode>(N);<br>
                +<br>
                +  // TODO: Expanding load with constant mask may be
                optimized as well.<br>
                +  if (Mld->isExpandingLoad())<br>
                +    return SDValue();<br>
                +<br>
                   if (Mld->getExtensionType() == ISD::NON_EXTLOAD) {<br>
                     if (SDValue ScalarLoad =
                reduceMaskedLoadToScalarLoad(Mld, DAG, DCI))<br>
                       return ScalarLoad;<br>
                @@ -29996,6 +30044,10 @@ static SDValue
                reduceMaskedStoreToScalar<br>
                 static SDValue combineMaskedStore(SDNode *N,
                SelectionDAG &DAG,<br>
                                                   const X86Subtarget
                &Subtarget) {<br>
                   MaskedStoreSDNode *Mst =
                cast<MaskedStoreSDNode>(N);<br>
                +<br>
                +  if (Mst->isCompressingStore())<br>
                +    return SDValue();<br>
                +<br>
                   if (!Mst->isTruncatingStore())<br>
                     return reduceMaskedStoreToScalarStore(Mst, DAG);<br>
                <br>
                <br>
                Modified:
                llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td
                (original)<br>
                +++ llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td
                Wed Nov  2 22:23:55 2016<br>
                @@ -965,28 +965,23 @@ def X86mstore : PatFrag<(ops
                node:$src1,<br>
                <br>
                 def masked_store_aligned128 : PatFrag<(ops
                node:$src1, node:$src2, node:$src3),<br>
                                          (X86mstore node:$src1,
                node:$src2, node:$src3), [{<br>
                -  if (auto *Store =
                dyn_cast<MaskedStoreSDNode>(N))<br>
                -    return Store->getAlignment() >= 16;<br>
                -  return false;<br>
                +  return
                cast<MaskedStoreSDNode>(N)->getAlignment()
                >= 16;<br>
                 }]>;<br>
                <br>
                 def masked_store_aligned256 : PatFrag<(ops
                node:$src1, node:$src2, node:$src3),<br>
                                          (X86mstore node:$src1,
                node:$src2, node:$src3), [{<br>
                -  if (auto *Store =
                dyn_cast<MaskedStoreSDNode>(N))<br>
                -    return Store->getAlignment() >= 32;<br>
                -  return false;<br>
                +  return
                cast<MaskedStoreSDNode>(N)->getAlignment()
                >= 32;<br>
                 }]>;<br>
                <br>
                 def masked_store_aligned512 : PatFrag<(ops
                node:$src1, node:$src2, node:$src3),<br>
                                          (X86mstore node:$src1,
                node:$src2, node:$src3), [{<br>
                -  if (auto *Store =
                dyn_cast<MaskedStoreSDNode>(N))<br>
                -    return Store->getAlignment() >= 64;<br>
                -  return false;<br>
                +  return
                cast<MaskedStoreSDNode>(N)->getAlignment()
                >= 64;<br>
                 }]>;<br>
                <br>
                 def masked_store_unaligned : PatFrag<(ops
                node:$src1, node:$src2, node:$src3),<br>
                -                         (X86mstore node:$src1,
                node:$src2, node:$src3), [{<br>
                -  return isa<MaskedStoreSDNode>(N);<br>
                +                         (masked_store node:$src1,
                node:$src2, node:$src3), [{<br>
                +  return
                (!cast<MaskedStoreSDNode>(N)->isTruncatingStore())
                &&<br>
                +       
                 (!cast<MaskedStoreSDNode>(N)->isCompressingStore());<br>
                 }]>;<br>
                <br>
                 def X86mCompressingStore : PatFrag<(ops node:$src1,
                node:$src2, node:$src3),<br>
                <br>
                Added: llvm/trunk/test/CodeGen/X86/compress_expand.ll<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/compress_expand.ll?rev=285876&view=auto"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/compress_expand.ll?rev=285876&view=auto</a><br>
==============================================================================<br>
                --- llvm/trunk/test/CodeGen/X86/compress_expand.ll
                (added)<br>
                +++ llvm/trunk/test/CodeGen/X86/compress_expand.ll Wed
                Nov  2 22:23:55 2016<br>
                @@ -0,0 +1,247 @@<br>
                +; NOTE: Assertions have been autogenerated by
                utils/update_llc_test_checks.py<br>
                +; RUN: llc -mattr=+avx512vl,+avx512dq,+avx512bw < %s
                | FileCheck %s --check-prefix=ALL --check-prefix=SKX<br>
                +; RUN: llc -mattr=+avx512f < %s | FileCheck %s
                --check-prefix=ALL --check-prefix=KNL<br>
                +<br>
                +target datalayout =
                "e-m:e-i64:64-f80:128-n8:16:32:64-S128"<br>
                +target triple = "x86_64-unknown-linux-gnu"<br>
                +<br>
                +<br>
                +<br>
                +define <16 x float> @test1(float* %base) {<br>
                +; ALL-LABEL: test1:<br>
                +; ALL:       # BB#0:<br>
                +; ALL-NEXT:    movw $-2049, %ax # imm = 0xF7FF<br>
                +; ALL-NEXT:    kmovw %eax, %k1<br>
                +; ALL-NEXT:    vexpandps (%rdi), %zmm0 {%k1} {z}<br>
                +; ALL-NEXT:    retq<br>
                +  %res = call <16 x float>
                @llvm.masked.expandload.v16f32(float* %base, <16 x
                i1> <i1 true, i1 true, i1 true, i1 true, i1 true,
                i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1
                false, i1 true, i1 true, i1 true, i1 true>, <16 x
                float> undef)<br>
                +  ret <16 x float>%res<br>
                +}<br>
                +<br>
                +define <16 x float> @test2(float* %base, <16 x
                float> %src0) {<br>
                +; ALL-LABEL: test2:<br>
                +; ALL:       # BB#0:<br>
                +; ALL-NEXT:    movw $30719, %ax # imm = 0x77FF<br>
                +; ALL-NEXT:    kmovw %eax, %k1<br>
                +; ALL-NEXT:    vexpandps (%rdi), %zmm0 {%k1}<br>
                +; ALL-NEXT:    retq<br>
                +  %res = call <16 x float>
                @llvm.masked.expandload.v16f32(float* %base, <16 x
                i1> <i1 true, i1 true, i1 true, i1 true, i1 true,
                i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1
                false, i1 true, i1 true, i1 true, i1 false>, <16 x
                float> %src0)<br>
                +  ret <16 x float>%res<br>
                +}<br>
                +<br>
                +define <8 x double> @test3(double* %base, <8 x
                double> %src0, <8 x i1> %mask) {<br>
                +; SKX-LABEL: test3:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpsllw $15, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vpmovw2m %xmm1, %k1<br>
                +; SKX-NEXT:    vexpandpd (%rdi), %zmm0 {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test3:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    vpmovsxwq %xmm1, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vexpandpd (%rdi), %zmm0 {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +  %res = call <8 x double>
                @llvm.masked.expandload.v8f64(double* %base, <8 x
                i1> %mask, <8 x double> %src0)<br>
                +  ret <8 x double>%res<br>
                +}<br>
                +<br>
                +define <4 x float> @test4(float* %base, <4 x
                float> %src0) {<br>
                +; SKX-LABEL: test4:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    movb $7, %al<br>
                +; SKX-NEXT:    kmovb %eax, %k1<br>
                +; SKX-NEXT:    vexpandps (%rdi), %xmm0 {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test4:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    movw $7, %ax<br>
                +; KNL-NEXT:    kmovw %eax, %k1<br>
                +; KNL-NEXT:    vexpandps (%rdi), %zmm0 {%k1}<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<kill><br>
                +; KNL-NEXT:    retq<br>
                +  %res = call <4 x float>
                @llvm.masked.expandload.v4f32(float* %base, <4 x
                i1> <i1 true, i1 true, i1 true, i1 false>,
                <4 x float> %src0)<br>
                +  ret <4 x float>%res<br>
                +}<br>
                +<br>
                +define <2 x i64> @test5(i64* %base, <2 x
                i64> %src0) {<br>
                +; SKX-LABEL: test5:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    movb $2, %al<br>
                +; SKX-NEXT:    kmovb %eax, %k1<br>
                +; SKX-NEXT:    vpexpandq (%rdi), %xmm0 {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test5:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    movb $2, %al<br>
                +; KNL-NEXT:    kmovw %eax, %k1<br>
                +; KNL-NEXT:    vpexpandq (%rdi), %zmm0 {%k1}<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<kill><br>
                +; KNL-NEXT:    retq<br>
                +  %res = call <2 x i64>
                @llvm.masked.expandload.v2i64(i64* %base, <2 x i1>
                <i1 false, i1 true>, <2 x i64> %src0)<br>
                +  ret <2 x i64>%res<br>
                +}<br>
                +<br>
                +declare <16 x float>
                @llvm.masked.expandload.v16f32(float*, <16 x i1>,
                <16 x float>)<br>
                +declare <8 x double>
                @llvm.masked.expandload.v8f64(double*, <8 x i1>,
                <8 x double>)<br>
                +declare <4 x float> 
                @llvm.masked.expandload.v4f32(float*, <4 x i1>,
                <4 x float>)<br>
                +declare <2 x i64>   
                @llvm.masked.expandload.v2i64(i64*, <2 x i1>,
                <2 x i64>)<br>
                +<br>
                +define void @test6(float* %base, <16 x float> %V)
                {<br>
                +; ALL-LABEL: test6:<br>
                +; ALL:       # BB#0:<br>
                +; ALL-NEXT:    movw $-2049, %ax # imm = 0xF7FF<br>
                +; ALL-NEXT:    kmovw %eax, %k1<br>
                +; ALL-NEXT:    vcompressps %zmm0, (%rdi) {%k1}<br>
                +; ALL-NEXT:    retq<br>
                +  call void @llvm.masked.compressstore.v16f32(<16 x
                float> %V, float* %base, <16 x i1> <i1 true,
                i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1
                true, i1 true, i1 true, i1 true, i1 false, i1 true, i1
                true, i1 true, i1 true>)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test7(float* %base, <8 x float> %V,
                <8 x i1> %mask) {<br>
                +; SKX-LABEL: test7:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpsllw $15, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vpmovw2m %xmm1, %k1<br>
                +; SKX-NEXT:    vcompressps %ymm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test7:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %YMM0<def>
                %YMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    vpmovsxwq %xmm1, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k0<br>
                +; KNL-NEXT:    kshiftlw $8, %k0, %k0<br>
                +; KNL-NEXT:    kshiftrw $8, %k0, %k1<br>
                +; KNL-NEXT:    vcompressps %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +  call void @llvm.masked.compressstore.v8f32(<8 x
                float> %V, float* %base, <8 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test8(double* %base, <8 x double>
                %V, <8 x i1> %mask) {<br>
                +; SKX-LABEL: test8:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpsllw $15, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vpmovw2m %xmm1, %k1<br>
                +; SKX-NEXT:    vcompresspd %zmm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test8:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    vpmovsxwq %xmm1, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vcompresspd %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +  call void @llvm.masked.compressstore.v8f64(<8 x
                double> %V, double* %base, <8 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test9(i64* %base, <8 x i64> %V,
                <8 x i1> %mask) {<br>
                +; SKX-LABEL: test9:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpsllw $15, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vpmovw2m %xmm1, %k1<br>
                +; SKX-NEXT:    vpcompressq %zmm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test9:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    vpmovsxwq %xmm1, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vpcompressq %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +    call void @llvm.masked.compressstore.v8i64(<8 x
                i64> %V, i64* %base, <8 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test10(i64* %base, <4 x i64> %V,
                <4 x i1> %mask) {<br>
                +; SKX-LABEL: test10:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpslld $31, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vptestmd %xmm1, %xmm1, %k1<br>
                +; SKX-NEXT:    vpcompressq %ymm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test10:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %YMM0<def>
                %YMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    vpslld $31, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpsrad $31, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpmovsxdq %xmm1, %ymm1<br>
                +; KNL-NEXT:    vpxord %zmm2, %zmm2, %zmm2<br>
                +; KNL-NEXT:    vinserti64x4 $0, %ymm1, %zmm2, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vpcompressq %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +    call void @llvm.masked.compressstore.v4i64(<4 x
                i64> %V, i64* %base, <4 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test11(i64* %base, <2 x i64> %V,
                <2 x i1> %mask) {<br>
                +; SKX-LABEL: test11:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpsllq $63, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vptestmq %xmm1, %xmm1, %k1<br>
                +; SKX-NEXT:    vpcompressq %xmm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test11:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    vpsllq $63, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpsrad $31, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]<br>
                +; KNL-NEXT:    vpxord %zmm2, %zmm2, %zmm2<br>
                +; KNL-NEXT:    vinserti32x4 $0, %xmm1, %zmm2, %zmm1<br>
                +; KNL-NEXT:    vpsllq $63, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmq %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vpcompressq %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +    call void @llvm.masked.compressstore.v2i64(<2 x
                i64> %V, i64* %base, <2 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +define void @test12(float* %base, <4 x float> %V,
                <4 x i1> %mask) {<br>
                +; SKX-LABEL: test12:<br>
                +; SKX:       # BB#0:<br>
                +; SKX-NEXT:    vpslld $31, %xmm1, %xmm1<br>
                +; SKX-NEXT:    vptestmd %xmm1, %xmm1, %k1<br>
                +; SKX-NEXT:    vcompressps %xmm0, (%rdi) {%k1}<br>
                +; SKX-NEXT:    retq<br>
                +;<br>
                +; KNL-LABEL: test12:<br>
                +; KNL:       # BB#0:<br>
                +; KNL-NEXT:    # kill: %XMM0<def>
                %XMM0<kill> %ZMM0<def><br>
                +; KNL-NEXT:    vpslld $31, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpsrad $31, %xmm1, %xmm1<br>
                +; KNL-NEXT:    vpxord %zmm2, %zmm2, %zmm2<br>
                +; KNL-NEXT:    vinserti32x4 $0, %xmm1, %zmm2, %zmm1<br>
                +; KNL-NEXT:    vpslld $31, %zmm1, %zmm1<br>
                +; KNL-NEXT:    vptestmd %zmm1, %zmm1, %k1<br>
                +; KNL-NEXT:    vcompressps %zmm0, (%rdi) {%k1}<br>
                +; KNL-NEXT:    retq<br>
                +    call void @llvm.masked.compressstore.v4f32(<4 x
                float> %V, float* %base, <4 x i1> %mask)<br>
                +  ret void<br>
                +}<br>
                +<br>
                +declare void @llvm.masked.compressstore.v16f32(<16 x
                float>, float* , <16 x i1>)<br>
                +declare void @llvm.masked.compressstore.v8f32(<8 x
                float>, float* , <8 x i1>)<br>
                +declare void @llvm.masked.compressstore.v8f64(<8 x
                double>, double* , <8 x i1>)<br>
                +declare void @llvm.masked.compressstore.v16i32(<16 x
                i32>, i32* , <16 x i1>)<br>
                +declare void @llvm.masked.compressstore.v8i32(<8 x
                i32>, i32* , <8 x i1>)<br>
                +declare void @llvm.masked.compressstore.v8i64(<8 x
                i64>, i64* , <8 x i1>)<br>
                +declare void @llvm.masked.compressstore.v4i32(<4 x
                i32>, i32* , <4 x i1>)<br>
                +declare void @llvm.masked.compressstore.v4f32(<4 x
                float>, float* , <4 x i1>)<br>
                +declare void @llvm.masked.compressstore.v4i64(<4 x
                i64>, i64* , <4 x i1>)<br>
                +declare void @llvm.masked.compressstore.v2i64(<2 x
                i64>, i64* , <2 x i1>)<br>
                <br>
                Modified: llvm/trunk/utils/TableGen/CodeGenTarget.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/CodeGenTarget.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/utils/TableGen/CodeGenTarget.cpp
                (original)<br>
                +++ llvm/trunk/utils/TableGen/CodeGenTarget.cpp Wed Nov 
                2 22:23:55 2016<br>
                @@ -550,8 +550,7 @@
                CodeGenIntrinsic::CodeGenIntrinsic(Recor<br>
                       // overloaded, all the types can be specified
                directly.<br>
                     
                 assert(((!TyEl->isSubClassOf("LLVMExtendedType")
                &&<br>
                               
                !TyEl->isSubClassOf("LLVMTruncatedType") &&<br>
                -             
                 !TyEl->isSubClassOf("LLVMVectorSameWidth")
                &&<br>
                -             
                 !TyEl->isSubClassOf("LLVMPointerToElt")) ||<br>
                +             
                 !TyEl->isSubClassOf("LLVMVectorSameWidth")) ||<br>
                               VT == MVT::iAny || VT == MVT::vAny)
                &&<br>
                              "Expected iAny or vAny type");<br>
                     } else<br>
                <br>
                Modified: llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=285876&r1=285875&r2=285876&view=diff"
                  target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp?rev=285876&r1=285875&r2=285876&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp
                (original)<br>
                +++ llvm/trunk/utils/TableGen/IntrinsicEmitter.cpp Wed
                Nov  2 22:23:55 2016<br>
                @@ -213,10 +213,11 @@ enum IIT_Info {<br>
                   IIT_HALF_VEC_ARG = 30,<br>
                   IIT_SAME_VEC_WIDTH_ARG = 31,<br>
                   IIT_PTR_TO_ARG = 32,<br>
                -  IIT_VEC_OF_PTRS_TO_ELT = 33,<br>
                -  IIT_I128 = 34,<br>
                -  IIT_V512 = 35,<br>
                -  IIT_V1024 = 36<br>
                +  IIT_PTR_TO_ELT = 33,<br>
                +  IIT_VEC_OF_PTRS_TO_ELT = 34,<br>
                +  IIT_I128 = 35,<br>
                +  IIT_V512 = 36,<br>
                +  IIT_V1024 = 37<br>
                 };<br>
                <br>
                <br>
                @@ -277,6 +278,8 @@ static void EncodeFixedType(Record
                *R, s<br>
                       Sig.push_back(IIT_PTR_TO_ARG);<br>
                     else if
                (R->isSubClassOf("LLVMVectorOfPointersToElt"))<br>
                       Sig.push_back(IIT_VEC_OF_PTRS_TO_ELT);<br>
                +    else if (R->isSubClassOf("LLVMPointerToElt"))<br>
                +      Sig.push_back(IIT_PTR_TO_ELT);<br>
                     else<br>
                       Sig.push_back(IIT_ARG);<br>
                     return Sig.push_back((Number << 3) |
                ArgCodes[Number]);<br>
                <br>
                <br>
                _______________________________________________<br>
                llvm-commits mailing list<br>
                <a moz-do-not-send="true"
                  href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>
                <a moz-do-not-send="true"
                  href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits"
                  target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a></p>
            </blockquote>
          </div>
        </div>
      </div>
      <p>---------------------------------------------------------------------<br>
        Intel Israel (74) Limited</p>
      <br>
      <br>
    </blockquote>
    <br>
  </body>
</html>