<div dir="ltr"><div>Does setting ISD::ROTL to Expand (via setOperationAction) not work?</div><div><br></div><div>You could also use a Custom lowering hook if that's what you're looking for.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <span dir="ltr"><<a href="mailto:phil.a.tomson@gmail.com" target="_blank">phil.a.tomson@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>... or perhaps to rephrase:<br><br></div>In 3.9 it seems to be doing a smaller combine much sooner, whereas in 3.6 it deferred that until later in the instruction selection pattern matching - the latter was giving us better results because it seems to match a larger pattern than the former did in the earlier stage.<span class="HOEnZb"><font color="#888888"><br><br></font></span></div><span class="HOEnZb"><font color="#888888">Phil<br></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 3, 2016 at 2:07 PM, Phil Tomson <span dir="ltr"><<a href="mailto:phil.a.tomson@gmail.com" target="_blank">phil.a.tomson@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid"><div dir="ltr"><div><div><div><div><div><div>Is there any way to get it to delay this optimization where it goes from this:<span><br><br>Initial selection DAG: BB#0 'bclr64:entry'<br>SelectionDAG has 14 nodes:<br> t0: ch = EntryToken<br> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1<br> t6: i64 = sub t4, Constant:i64<1><br> t7: i64 = shl Constant:i64<1>, t6<br> t9: i64 = xor t7, Constant:i64<-1><br> t10: i64 = and t2, t9<br> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, 
t12:1<br><br><br><br>Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br><br>Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br><br>Combining: t11: i64 = Register %R1<br><br>Combining: t10: i64 = and t2, t9<br><br>Combining: t9: i64 = xor t7, Constant:i64<-1><br> ... into: t15: i64 = rotl Constant:i64<-2>, t6<br><br></span></div>...to this:<br><br>Optimized lowered selection DAG: BB#0 'bclr64:entry'<br>SelectionDAG has 13 nodes:<span><br> t0: ch = EntryToken<br> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1<br></span><span> t17: i64 = add t4, Constant:i64<-1><br></span><span> t15: i64 = rotl Constant:i64<-2>, t17<br></span><span> t10: i64 = and t2, t15<br></span><span> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br><br><br></span></div>That combining of the xor & and there ends up giving us suboptimal results as compared with 3.6.<br><br></div>For example, in 3.6 the generated code is simply:<br><br>bclr64: <wbr> # @bclr64<br># BB#0: <wbr> # %entry<br> addI r1, r1, -1, 64<span><br> bclr r1, r0, r1, 64<br></span> jabs r511<br><br></div>Whereas with 3.9 the generated code is:<br><br>bclr64: <wbr> # @bclr64<br># BB#0: <wbr> # %entry<br> addI r1, r1, -1, 64<br> movimm r2, -2, 64<br> rol r1, r2, r1, 64<br> bitop1 r1, r0, r1, AND, 64<br> jabs r511<br><br><br></div>... 
it also seems to be negatively impacting some of our larger benchmarks, which used to contain several bclr (bit clear) instructions but now contain far fewer.<span class="m_-1912501019048824133HOEnZb"><font color="#888888"><br><br></font></span></div><span class="m_-1912501019048824133HOEnZb"><font color="#888888">Phil<br><div><div><div><br><br><div><div><div><br></div></div></div></div></div></div></font></span></div><div class="m_-1912501019048824133HOEnZb"><div class="m_-1912501019048824133h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 2, 2016 at 4:10 PM, Ryan Taylor <span dir="ltr"><<a href="mailto:ryta1203@gmail.com" target="_blank">ryta1203@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid"><div dir="ltr">I believe some of the ISD opcodes were introduced to allow for DAG optimizations under the assumption that some of the major architectures directly support these types of instructions. <div><br><div><div><div>-Ryan</div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="m_-1912501019048824133m_-2356569710591101224h5">On Wed, Nov 2, 2016 at 6:24 PM, Phil Tomson via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid"><div><div class="m_-1912501019048824133m_-2356569710591101224h5"><div dir="ltr"><div><div><div>We've recently moved our project from LLVM 3.6 to LLVM 3.9. 
I noticed one of our code generation tests is breaking in 3.9.<br><br></div>The test is:<br><br> ; RUN: llc < %s -march=xstg | FileCheck %s<br><br>define i64 @bclr64(i64 %a, i64 %b) nounwind readnone {<br>entry:<br>; CHECK: bclr r1, r0, r1, 64<br> %sub = sub i64 %b, 1<br> %shl = shl i64 1, %sub<br> %xor = xor i64 %shl, -1<br> %and = and i64 %a, %xor<br> ret i64 %and<br>}<br><br></div>I ran llc with -debug to get a better idea of what's going on and found:<br><br>Initial selection DAG: BB#0 'bclr64:entry'<br>SelectionDAG has 14 nodes:<br> t0: ch = EntryToken<br> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1<br> t6: i64 = sub t4, Constant:i64<1><br> t7: i64 = shl Constant:i64<1>, t6<br> t9: i64 = xor t7, Constant:i64<-1><br> t10: i64 = and t2, t9<br> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br><br><br><br>Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br><br>Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br><br>Combining: t11: i64 = Register %R1<br><br>Combining: t10: i64 = and t2, t9<br><br>Combining: t9: i64 = xor t7, Constant:i64<-1><br> ... into: t15: i64 = rotl Constant:i64<-2>, t6<br><br>Combining: t10: i64 = and t2, t15<br><br>Combining: t15: i64 = rotl Constant:i64<-2>, t6<br><br>Combining: t14: i64 = Constant<-2><br><br>Combining: t6: i64 = sub t4, Constant:i64<1><br> ... into: t17: i64 = add t4, Constant:i64<-1><br><br>Combining: t15: i64 = rotl Constant:i64<-2>, t17<br><br><br><br></div><div>These rotl instructions weren't showing up when I ran llc 3.6 and that's completely changing the generated code at the end which means the test fails (and it's less optimal than it was in 3.6). <br><br>I've been looking in the LLVM language docs (3.9 version) and I don't see any documentation on 'rotl'. What does it do? 
Why isn't it in the docs?<span class="m_-1912501019048824133m_-2356569710591101224m_2351564204507218430HOEnZb"><font color="#888888"><br><br></font></span></div><span class="m_-1912501019048824133m_-2356569710591101224m_2351564204507218430HOEnZb"><font color="#888888"><div>Phil<br></div></font></span></div>
<br></div></div>______________________________<wbr>_________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank" rel="noreferrer">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
<br></blockquote></div><br></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
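<div>On the "what does rotl do?" question quoted above: rotl is a rotate-left, and the combine in the debug output rewrites xor (shl 1, n), -1 as rotl -2, n. Below is a minimal standalone C++ sketch of why those two forms produce the same mask. It is not LLVM code; the helper names (rotl64, mask_via_shl_xor, mask_via_rotl, masks_agree, bclr64) are made up for illustration, and it assumes the rotate amount is taken modulo 64 and that n stays below 64, since a larger shl amount is not meaningful in the IR.</div>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rotate a 64-bit value left by n bits.
// Assumption: the rotate amount is taken modulo 64.
std::uint64_t rotl64(std::uint64_t x, unsigned n) {
  n &= 63;
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// The mask as written in the test IR: xor (shl 1, n), -1.
std::uint64_t mask_via_shl_xor(unsigned n) {
  return ~(std::uint64_t{1} << n);
}

// The mask after the combine: rotl -2, n.
// As an i64, -2 is ~1: every bit set except bit 0.
std::uint64_t mask_via_rotl(unsigned n) {
  return rotl64(~std::uint64_t{1}, n);
}

// Checks that both forms give the same mask for every in-range amount.
bool masks_agree() {
  for (unsigned n = 0; n < 64; ++n) {
    if (mask_via_shl_xor(n) != mask_via_rotl(n)) {
      return false;
    }
  }
  return true;
}

// C++ mirror of the bclr64 test function: clear bit (b - 1) of a.
// Only meaningful for 1 <= b <= 64.
std::uint64_t bclr64(std::uint64_t a, std::uint64_t b) {
  return a & mask_via_shl_xor(static_cast<unsigned>(b - 1));
}
```

<div>Rotating ~1 left by n moves the single zero bit from position 0 to position n, which is the same mask that shifting 1 left by n and inverting produces. The two DAGs compute the same value; the difference is only in which pattern is still visible to instruction selection.</div>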