<div dir="ltr">OK, I've made another fix in r348104.<div><br><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Dec 1, 2018 at 11:28 AM Craig Topper <<a href="mailto:craig.topper@gmail.com">craig.topper@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>I was afraid of that. I thought I had checked whether InstCombine would remove the bitcast here, but I guess I didn't or didn't do it right. I'll see what I can do to fix this.</div><br clear="all"><div><div dir="ltr" class="m_7845153108177002991gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Dec 1, 2018 at 4:39 AM Johan Engelen <<a href="mailto:jbc.engelen@gmail.com" target="_blank">jbc.engelen@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div dir="ltr">Hello Craig,<div> Thank you for the quick response and fix.</div><div>However, the improvement turns out to be quite fragile. If I run `opt` on the original testcase and then run its output through `llc`, I get the same very long assembly output as before. 
(Things work for a bitcast from <16 x i1> to i16, but not for a <16 x i1>* store.) </div><div>Godbolt link: <a href="https://llvm.godbolt.org/z/j1ob9w" target="_blank">https://llvm.godbolt.org/z/j1ob9w</a></div><div><br></div><div>regards,</div><div> Johan</div><div><br></div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Nov 27, 2018 at 4:00 AM Craig Topper <<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">We should handle this a lot better after r34763<div><br><div><div dir="ltr" class="m_7845153108177002991m_1621764481283116402m_-8733577381501617668gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Nov 26, 2018 at 3:13 PM Craig Topper <<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div dir="ltr"><div>Here's a quick patch that fixes this. I don't know how to avoid it in IR. I haven't checked any other tests, but it does fix your case. 
I'll try to put up a real Phabricator review tonight or tomorrow.</div><div><br></div><div><div>diff --git a/lib/Target/X86/X86ISelLowering.cpp b/lib/Target/X86/X86ISelLowering.cpp</div><div>index e31f2a6..d79c0be 100644</div><div>--- a/lib/Target/X86/X86ISelLowering.cpp</div><div>+++ b/lib/Target/X86/X86ISelLowering.cpp</div><div>@@ -4837,6 +4837,11 @@ bool X86TargetLowering::isCheapToSpeculateCtlz() const {</div><div><br></div><div> bool X86TargetLowering::isLoadBitCastBeneficial(EVT LoadVT,</div><div> EVT BitcastVT) const {</div><div>+ if (!LoadVT.isVector() && BitcastVT.isVector() &&</div><div>+ BitcastVT.getVectorElementType() == MVT::i1 &&</div><div>+ !Subtarget.hasAVX512())</div><div>+ return false;</div><div>+</div><div> if (!Subtarget.hasDQI() && BitcastVT == MVT::v8i1)</div><div> return false;</div></div><div><br></div><br clear="all"><div><div dir="ltr" class="m_7845153108177002991m_1621764481283116402m_-8733577381501617668m_7946101956755504508gmail_signature">~Craig</div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Nov 26, 2018 at 2:51 PM Johan Engelen via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi all,<div> I've run into a case where the optimizer seems to be having trouble doing the "obvious" thing.</div><div><br></div><div>Consider this code:</div><div>```</div><div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><div><span style="color:rgb(0,0,255)">define</span> <span style="color:rgb(0,128,128)">i16</span> <span style="color:rgb(0,17,136)">@foo</span>(<<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>* <span style="color:rgb(0,0,255)">dereferenceable</span>(<span style="color:rgb(9,136,90)">16</span>) <span 
style="color:rgb(0,17,136)">%egress</span>, <<span style="color:rgb(9,136,90)">16</span> x <span style="color:rgb(0,128,128)">i8</span>> <span style="color:rgb(0,17,136)">%a0</span>) {</div><div> <span style="color:rgb(0,17,136)"> %a1</span> = icmp slt <<span style="color:rgb(9,136,90)">16</span> x <span style="color:rgb(0,128,128)">i8</span>> <span style="color:rgb(0,17,136)">%a0</span>, <span style="color:rgb(221,0,0)">zeroinitializer</span></div><div> <span style="color:rgb(0,17,136)"> %a2</span> = bitcast <<span style="color:rgb(9,136,90)">16</span> x <span style="color:rgb(0,128,128)">i1</span>> <span style="color:rgb(0,17,136)">%a1</span> <span style="color:rgb(0,0,255)">to</span> <span style="color:rgb(0,128,128)">i16</span></div><div> <span style="color:rgb(0,17,136)"> %astore</span> = getelementptr inbounds <<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>, <<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>* <span style="color:rgb(0,17,136)">%egress</span>, <span style="color:rgb(0,128,128)">i64</span> <span style="color:rgb(9,136,90)">0</span>, <span style="color:rgb(0,128,128)">i64</span> <span style="color:rgb(9,136,90)">7</span></div><div> <span style="color:rgb(0,128,0)"> ;store i16 %a2, i16* %astore</span></div><div> ret <span style="color:rgb(0,128,128)">i16</span> <span style="color:rgb(0,17,136)">%a2</span></div><div>}</div>```</div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)">The optimizer recognizes this and llc nicely outputs a vpmovmskb instruction:</div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)">```</div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><div><div><span style="color:rgb(0,128,128)">foo:</span> <span style="color:rgb(0,128,0)"># @foo</span></div><div> <span style="color:rgb(0,0,255)"> vpmovmskb</span> <span style="color:rgb(72,100,170)">eax</span>, <span 
style="color:rgb(72,100,170)">xmm0</span></div><div> <span style="color:rgb(0,0,255)"> ret</span></div></div><div><span style="color:rgb(0,0,255)">```</span></div></div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><br></div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)">Writing to the output vector also works well:</div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)">```</div><div style="color:rgb(0,0,0);background-color:rgb(255,255,254)"><div><span style="color:rgb(0,0,255)">define</span> <span style="color:rgb(0,128,128)">void</span> <span style="color:rgb(0,17,136)">@writing</span>(<<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>* <span style="color:rgb(0,0,255)">dereferenceable</span>(<span style="color:rgb(9,136,90)">16</span>) <span style="color:rgb(0,17,136)">%egress</span>, <<span style="color:rgb(9,136,90)">16</span> x <span style="color:rgb(0,128,128)">i8</span>> <span style="color:rgb(0,17,136)">%a0</span>) {</div><div> <span style="color:rgb(0,17,136)"> %astore</span> = getelementptr inbounds <<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>, <<span style="color:rgb(9,136,90)">8</span> x <span style="color:rgb(0,128,128)">i16</span>>* <span style="color:rgb(0,17,136)">%egress</span>, <span style="color:rgb(0,128,128)">i64</span> <span style="color:rgb(9,136,90)">0</span>, <span style="color:rgb(0,128,128)">i64</span> <span style="color:rgb(9,136,90)">7</span></div><div> store <span style="color:rgb(0,128,128)">i16</span> <span style="color:rgb(9,136,90)">123</span>, <span style="color:rgb(0,128,128)">i16*</span> <span style="color:rgb(0,17,136)">%astore</span></div><div> ret <span style="color:rgb(0,128,128)">void</span></div><div>}</div><div>```</div><div>outputs:</div><div>```</div><div><div><div><span style="color:rgb(0,128,128)">writing:</span> <span style="color:rgb(0,128,0)"># @writing</span></div><div> <span 
style="color:rgb(0,0,255)"> mov</span> <span style="color:rgb(0,128,128)">word</span> <span style="color:rgb(0,128,128)">ptr</span> [<span style="color:rgb(72,100,170)">rdi</span> + <span style="color:rgb(9,136,90)">14</span>], <span style="color:rgb(9,136,90)">123</span></div><div> <span style="color:rgb(0,0,255)"> ret</span></div></div></div><div><span style="color:rgb(0,0,255)">```</span></div><div><br></div><div>Now, combining these two by uncommenting the store in `foo()` suddenly results in a very large function, instead of just:</div><div><div><div><span style="color:rgb(0,0,255)"> vpmovmskb</span> <span style="color:rgb(72,100,170)">eax</span>, <span style="color:rgb(72,100,170)">xmm0</span></div><div><span style="color:rgb(0,0,255)"> mov</span> <span style="color:rgb(0,128,128)">word</span> <span style="color:rgb(0,128,128)">ptr</span> [<span style="color:rgb(72,100,170)">rdi</span> + <span style="color:rgb(9,136,90)">14</span>], <span style="color:rgb(9,136,90)">ax</span></div><div><span style="color:rgb(0,0,255)"> ret</span></div></div><br class="m_7845153108177002991m_1621764481283116402m_-8733577381501617668m_7946101956755504508m_-7502635116809966063gmail-Apple-interchange-newline"></div><div>Is there something wrong with my IR code, or is the optimizer somehow confused? Can I rewrite the code such that the optimizer does understand?</div><div><br></div><div>Godbolt link: <a href="https://llvm.godbolt.org/z/OgExDk" target="_blank">https://llvm.godbolt.org/z/OgExDk</a></div><div><br></div><div>Thanks a lot for the help.</div><div>Cheers,</div><div> Johan</div><div><br></div></div></div></div></div></div></div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>
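<div dir="ltr">For readers following the thread, the following is a minimal scalar sketch in C++ of what the IR in the original testcase computes: the `icmp slt <16 x i8> %a0, zeroinitializer` followed by `bitcast <16 x i1> to i16` collects the sign bit of each byte lane into one 16-bit mask, and the store writes that mask to element 7 of the `<8 x i16>` output. The function names `sign_mask` and `store_mask` are hypothetical and only for illustration; they are not LLVM APIs. The sketch assumes a little-endian target such as x86, where lane 0 maps to bit 0.</div>

```cpp
#include <array>
#include <cstdint>

// Scalar reference for:
//   %a1 = icmp slt <16 x i8> %a0, zeroinitializer
//   %a2 = bitcast <16 x i1> %a1 to i16
// Assumption: little-endian lane order, so lane i ends up in bit i.
std::uint16_t sign_mask(const std::array<std::int8_t, 16>& lanes) {
    std::uint16_t mask = 0;
    for (unsigned i = 0; i < lanes.size(); ++i) {
        if (lanes[i] < 0) {
            // Set bit i when byte lane i is negative (its sign bit is set).
            mask = static_cast<std::uint16_t>(mask | (1u << i));
        }
    }
    return mask;
}

// Hypothetical stand-in for the store in @foo: write the mask to
// element 7 of the 8 x i16 output, i.e. byte offset 14.
void store_mask(std::array<std::uint16_t, 8>& egress,
                const std::array<std::int8_t, 16>& lanes) {
    egress[7] = sign_mask(lanes);
}
```

<div dir="ltr">In these terms, the two-instruction output Johan expected is one `vpmovmskb` for `sign_mask` plus one 16-bit `mov` to `[rdi + 14]` for the store into element 7.</div>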