<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Do you know what ISA features it’s being compiled for? I think I shouldn’t be doing this combine unless at least SSE4.1 is supported, but I don’t know if that would cause a failure.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><a name="_____replyseparator"></a><b>From:</b> Jordan Rupprecht <rupprecht@google.com>
<br>
<b>Sent:</b> Monday, July 8, 2019 4:34 PM<br>
<b>To:</b> Topper, Craig <craig.topper@intel.com><br>
<b>Cc:</b> llvm-commits <llvm-commits@lists.llvm.org><br>
<b>Subject:</b> Re: [llvm] r364977 - [X86] Add a DAG combine for turning *_extend_vector_inreg+load into an appropriate extload if the load isn't volatile.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">FYI, we're seeing some kinda strange test failures after this patch. I'm unable to get a reproducer, but it's on some utility code that is (at a very high level) implementing a vector<bool>-type thing, and is now returning 0 for some elements
that were previously set.<o:p></o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">On Tue, Jul 2, 2019 at 4:20 PM Craig Topper via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal">Author: ctopper<br>
Date: Tue Jul 2 16:20:03 2019<br>
New Revision: 364977<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=364977&view=rev" target="_blank">
http://llvm.org/viewvc/llvm-project?rev=364977&view=rev</a><br>
Log:<br>
[X86] Add a DAG combine for turning *_extend_vector_inreg+load into an appropriate extload if the load isn't volatile.<br>
<br>
Remove the corresponding isel patterns that did the same thing without checking for volatile.<br>
<br>
This fixes another variation of PR42079<br>
<br>
Modified:<br>
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br>
llvm/trunk/lib/Target/X86/X86InstrSSE.td<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=364977&r1=364976&r2=364977&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=364977&r1=364976&r2=364977&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jul 2 16:20:03 2019<br>
@@ -1876,6 +1876,7 @@ X86TargetLowering::X86TargetLowering(con<br>
setTargetDAGCombine(ISD::SIGN_EXTEND);<br>
setTargetDAGCombine(ISD::SIGN_EXTEND_INREG);<br>
setTargetDAGCombine(ISD::ANY_EXTEND_VECTOR_INREG);<br>
+ setTargetDAGCombine(ISD::SIGN_EXTEND_VECTOR_INREG);<br>
setTargetDAGCombine(ISD::ZERO_EXTEND_VECTOR_INREG);<br>
setTargetDAGCombine(ISD::SINT_TO_FP);<br>
setTargetDAGCombine(ISD::UINT_TO_FP);<br>
@@ -43914,16 +43915,35 @@ static SDValue combinePMULDQ(SDNode *N,<br>
}<br>
<br>
static SDValue combineExtInVec(SDNode *N, SelectionDAG &DAG,<br>
+ TargetLowering::DAGCombinerInfo &DCI,<br>
const X86Subtarget &Subtarget) {<br>
+ EVT VT = N->getValueType(0);<br>
+ SDValue In = N->getOperand(0);<br>
+<br>
+ // Try to merge vector loads and extend_inreg to an extload.<br>
+ if (!DCI.isBeforeLegalizeOps() && ISD::isNormalLoad(In.getNode()) &&<br>
+ In.hasOneUse()) {<br>
+ auto *Ld = cast<LoadSDNode>(In);<br>
+ if (!Ld->isVolatile()) {<br>
+ MVT SVT = In.getSimpleValueType().getVectorElementType();<br>
+ ISD::LoadExtType Ext = N->getOpcode() == ISD::SIGN_EXTEND_VECTOR_INREG ? ISD::SEXTLOAD : ISD::ZEXTLOAD;<br>
+ EVT MemVT = EVT::getVectorVT(*DAG.getContext(), SVT,<br>
+ VT.getVectorNumElements());<br>
+ SDValue Load =<br>
+ DAG.getExtLoad(Ext, SDLoc(N), VT, Ld->getChain(), Ld->getBasePtr(),<br>
+ Ld->getPointerInfo(), MemVT, Ld->getAlignment(),<br>
+ Ld->getMemOperand()->getFlags());<br>
+ DAG.ReplaceAllUsesOfValueWith(SDValue(Ld, 1), Load.getValue(1));<br>
+ return Load;<br>
+ }<br>
+ }<br>
+<br>
// Disabling for widening legalization for now. We can enable if we find a<br>
// case that needs it. Otherwise it can be deleted when we switch to<br>
// widening legalization.<br>
if (ExperimentalVectorWideningLegalization)<br>
return SDValue();<br>
<br>
- EVT VT = N->getValueType(0);<br>
- SDValue In = N->getOperand(0);<br>
-<br>
// Combine (ext_invec (ext_invec X)) -> (ext_invec X)<br>
const TargetLowering &TLI = DAG.getTargetLoweringInfo();<br>
if (In.getOpcode() == N->getOpcode() &&<br>
@@ -43932,7 +43952,7 @@ static SDValue combineExtInVec(SDNode *N<br>
<br>
// Attempt to combine as a shuffle.<br>
// TODO: SSE41 support<br>
- if (Subtarget.hasAVX()) {<br>
+ if (Subtarget.hasAVX() && N->getOpcode() != ISD::SIGN_EXTEND_VECTOR_INREG) {<br>
SDValue Op(N, 0);<br>
if (TLI.isTypeLegal(VT) && TLI.isTypeLegal(In.getValueType()))<br>
if (SDValue Res = combineX86ShufflesRecursively(Op, DAG, Subtarget))<br>
@@ -44010,7 +44030,9 @@ SDValue X86TargetLowering::PerformDAGCom<br>
case ISD::SIGN_EXTEND: return combineSext(N, DAG, DCI, Subtarget);<br>
case ISD::SIGN_EXTEND_INREG: return combineSignExtendInReg(N, DAG, Subtarget);<br>
case ISD::ANY_EXTEND_VECTOR_INREG:<br>
- case ISD::ZERO_EXTEND_VECTOR_INREG: return combineExtInVec(N, DAG, Subtarget);<br>
+ case ISD::SIGN_EXTEND_VECTOR_INREG:<br>
+ case ISD::ZERO_EXTEND_VECTOR_INREG: return combineExtInVec(N, DAG, DCI,<br>
+ Subtarget);<br>
case ISD::SETCC: return combineSetCC(N, DAG, Subtarget);<br>
case X86ISD::SETCC: return combineX86SetCC(N, DAG, Subtarget);<br>
case X86ISD::BRCOND: return combineBrCond(N, DAG, Subtarget);<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrAVX512.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=364977&r1=364976&r2=364977&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrAVX512.td?rev=364977&r1=364976&r2=364977&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrAVX512.td (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrAVX512.td Tue Jul 2 16:20:03 2019<br>
@@ -9632,21 +9632,15 @@ multiclass AVX512_pmovx_patterns<string<br>
(!cast<I>(OpcPrefix#BWZ128rm) addr:$src)>;<br>
def : Pat<(v8i16 (InVecOp (v16i8 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BWZ128rm) addr:$src)>;<br>
- def : Pat<(v8i16 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BWZ128rm) addr:$src)>;<br>
}<br>
let Predicates = [HasVLX] in {<br>
def : Pat<(v4i32 (InVecOp (bc_v16i8 (v4i32 (scalar_to_vector (loadi32 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#BDZ128rm) addr:$src)>;<br>
def : Pat<(v4i32 (InVecOp (v16i8 (vzload_v4i32 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BDZ128rm) addr:$src)>;<br>
- def : Pat<(v4i32 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BDZ128rm) addr:$src)>;<br>
<br>
def : Pat<(v2i64 (InVecOp (bc_v16i8 (v4i32 (scalar_to_vector (extloadi32i16 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#BQZ128rm) addr:$src)>;<br>
- def : Pat<(v2i64 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BQZ128rm) addr:$src)>;<br>
<br>
def : Pat<(v4i32 (InVecOp (bc_v8i16 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#WDZ128rm) addr:$src)>;<br>
@@ -9654,15 +9648,11 @@ multiclass AVX512_pmovx_patterns<string<br>
(!cast<I>(OpcPrefix#WDZ128rm) addr:$src)>;<br>
def : Pat<(v4i32 (InVecOp (v8i16 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#WDZ128rm) addr:$src)>;<br>
- def : Pat<(v4i32 (InVecOp (loadv8i16 addr:$src))),<br>
- (!cast<I>(OpcPrefix#WDZ128rm) addr:$src)>;<br>
<br>
def : Pat<(v2i64 (InVecOp (bc_v8i16 (v4i32 (scalar_to_vector (loadi32 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#WQZ128rm) addr:$src)>;<br>
def : Pat<(v2i64 (InVecOp (v8i16 (vzload_v4i32 addr:$src)))),<br>
(!cast<I>(OpcPrefix#WQZ128rm) addr:$src)>;<br>
- def : Pat<(v2i64 (InVecOp (loadv8i16 addr:$src))),<br>
- (!cast<I>(OpcPrefix#WQZ128rm) addr:$src)>;<br>
<br>
def : Pat<(v2i64 (InVecOp (bc_v4i32 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#DQZ128rm) addr:$src)>;<br>
@@ -9670,37 +9660,27 @@ multiclass AVX512_pmovx_patterns<string<br>
(!cast<I>(OpcPrefix#DQZ128rm) addr:$src)>;<br>
def : Pat<(v2i64 (InVecOp (v4i32 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#DQZ128rm) addr:$src)>;<br>
- def : Pat<(v2i64 (InVecOp (loadv4i32 addr:$src))),<br>
- (!cast<I>(OpcPrefix#DQZ128rm) addr:$src)>;<br>
}<br>
let Predicates = [HasVLX] in {<br>
def : Pat<(v8i32 (InVecOp (bc_v16i8 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#BDZ256rm) addr:$src)>;<br>
def : Pat<(v8i32 (InVecOp (v16i8 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BDZ256rm) addr:$src)>;<br>
- def : Pat<(v8i32 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BDZ256rm) addr:$src)>;<br>
<br>
def : Pat<(v4i64 (InVecOp (bc_v16i8 (v4i32 (scalar_to_vector (loadi32 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#BQZ256rm) addr:$src)>;<br>
def : Pat<(v4i64 (InVecOp (v16i8 (vzload_v4i32 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BQZ256rm) addr:$src)>;<br>
- def : Pat<(v4i64 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BQZ256rm) addr:$src)>;<br>
<br>
def : Pat<(v4i64 (InVecOp (bc_v8i16 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#WQZ256rm) addr:$src)>;<br>
def : Pat<(v4i64 (InVecOp (v8i16 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#WQZ256rm) addr:$src)>;<br>
- def : Pat<(v4i64 (InVecOp (loadv8i16 addr:$src))),<br>
- (!cast<I>(OpcPrefix#WQZ256rm) addr:$src)>;<br>
}<br>
// 512-bit patterns<br>
let Predicates = [HasAVX512] in {<br>
def : Pat<(v8i64 (InVecOp (bc_v16i8 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#BQZrm) addr:$src)>;<br>
- def : Pat<(v8i64 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BQZrm) addr:$src)>;<br>
}<br>
}<br>
<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=364977&r1=364976&r2=364977&view=diff" target="_blank">
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=364977&r1=364976&r2=364977&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Tue Jul 2 16:20:03 2019<br>
@@ -4947,8 +4947,6 @@ multiclass SS41I_pmovx_avx2_patterns<str<br>
(!cast<I>(OpcPrefix#BDYrm) addr:$src)>;<br>
def : Pat<(v8i32 (InVecOp (v16i8 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BDYrm) addr:$src)>;<br>
- def : Pat<(v8i32 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BDYrm) addr:$src)>;<br>
<br>
def : Pat<(v4i64 (ExtOp (loadv4i32 addr:$src))),<br>
(!cast<I>(OpcPrefix#DQYrm) addr:$src)>;<br>
@@ -4957,15 +4955,11 @@ multiclass SS41I_pmovx_avx2_patterns<str<br>
(!cast<I>(OpcPrefix#BQYrm) addr:$src)>;<br>
def : Pat<(v4i64 (InVecOp (v16i8 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#BQYrm) addr:$src)>;<br>
- def : Pat<(v4i64 (InVecOp (loadv16i8 addr:$src))),<br>
- (!cast<I>(OpcPrefix#BQYrm) addr:$src)>;<br>
<br>
def : Pat<(v4i64 (InVecOp (bc_v8i16 (v2i64 (scalar_to_vector (loadi64 addr:$src)))))),<br>
(!cast<I>(OpcPrefix#WQYrm) addr:$src)>;<br>
def : Pat<(v4i64 (InVecOp (v8i16 (vzload_v2i64 addr:$src)))),<br>
(!cast<I>(OpcPrefix#WQYrm) addr:$src)>;<br>
- def : Pat<(v4i64 (InVecOp (loadv8i16 addr:$src))),<br>
- (!cast<I>(OpcPrefix#WQYrm) addr:$src)>;<br>
}<br>
}<br>
<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><o:p></o:p></p>
</blockquote>
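<p class="MsoNormal">For readers following the thread, here is a minimal sketch of the equivalence the new combine relies on, written as a plain C++ model rather than LLVM code. It assumes a v16i8 load feeding a v4i32 result, so the memory type of the extending load would be v4i8 (the source element type with the result's element count, as in the patch). All function and constant names below are hypothetical and exist only for illustration.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Plain C++ model of the transform, not LLVM code. All names are hypothetical.
constexpr std::size_t kSrcElts = 16; // lanes in the loaded v16i8 vector
constexpr std::size_t kDstElts = 4;  // lanes in the v4i32 result

using V16i8 = std::array<std::int8_t, kSrcElts>;
using V4i32 = std::array<std::int32_t, kDstElts>;

// Before the combine: load all 16 bytes, then sign-extend the low 4 lanes
// in-register (models SIGN_EXTEND_VECTOR_INREG of a normal load).
V4i32 loadThenSignExtendInReg(const std::int8_t *Ptr) {
  V16i8 Full;
  std::memcpy(Full.data(), Ptr, kSrcElts);
  V4i32 Out{};
  for (std::size_t I = 0; I < kDstElts; ++I)
    Out[I] = static_cast<std::int32_t>(Full[I]);
  return Out;
}

// Before the combine, zero-extending flavour (models ZERO_EXTEND_VECTOR_INREG).
V4i32 loadThenZeroExtendInReg(const std::int8_t *Ptr) {
  V16i8 Full;
  std::memcpy(Full.data(), Ptr, kSrcElts);
  V4i32 Out{};
  for (std::size_t I = 0; I < kDstElts; ++I)
    Out[I] = static_cast<std::uint8_t>(Full[I]);
  return Out;
}

// After the combine: a sign-extending load that reads only the low 4 bytes
// (models SEXTLOAD with a v4i8 memory type).
V4i32 signExtLoad(const std::int8_t *Ptr) {
  V4i32 Out{};
  for (std::size_t I = 0; I < kDstElts; ++I)
    Out[I] = static_cast<std::int32_t>(Ptr[I]);
  return Out;
}

// After the combine, zero-extending flavour (models ZEXTLOAD).
V4i32 zeroExtLoad(const std::int8_t *Ptr) {
  V4i32 Out{};
  for (std::size_t I = 0; I < kDstElts; ++I)
    Out[I] = static_cast<std::uint8_t>(Ptr[I]);
  return Out;
}
```

<p class="MsoNormal">In this model both forms give the same lanes for any 16-byte buffer; the extending load simply touches fewer bytes, which is why the patch skips volatile loads. The model says nothing about which subtarget features are needed to select the resulting extload, which is the open question in this thread.</p>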
</div>
</div>
</body>
</html>