<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">What is the advantage of preventing the rotl combine, rather than teaching instruction selection to match the extended pattern that includes the rotl and generates the bclr code?<div class=""><br class=""></div><div class="">-- </div><div class="">Mehdi</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Nov 3, 2016, at 4:41 PM, Krzysztof Parzyszek via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">One option may be to prevent the formation of ROTL, if possible, and then generate rol by hand.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">Marking it as "expand" would likely stop the DAG combiner from creating it. 
Then you could "preprocess" the selection DAG before the instruction selection and do the pattern matching yourself.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">-Krzysztof</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 
auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">On 11/3/2016 4:24 PM, Phil Tomson via llvm-dev wrote:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">I could try setting ISD::ROTL to Expand... however, we do have a rol op<br class="">and we'd like the ISD::ROTL to map to it.  If I set it to Expand it's<br class="">not going to do that, right?<br class=""><br class="">I think in this case we're just getting the ISD::ROTL a bit too soon in<br class="">the process and that's causing us to miss other optimization<br class="">opportunities later on.<br class=""><br class="">Phil<br class=""><br class="">On Thu, Nov 3, 2016 at 2:20 PM, Ryan Taylor <<a href="mailto:ryta1203@gmail.com" class="">ryta1203@gmail.com</a><br class=""><<a href="mailto:ryta1203@gmail.com" class="">mailto:ryta1203@gmail.com</a>>> wrote:<br class=""><br class="">   Setting the ISD::ROTL to Expand doesn't work? 
(via setOperationAction)<br class=""><br class="">   You could also do a Custom hook if that's what you're looking for.<br class=""><br class="">   On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <<a href="mailto:phil.a.tomson@gmail.com" class="">phil.a.tomson@gmail.com</a><br class="">   <<a href="mailto:phil.a.tomson@gmail.com" class="">mailto:phil.a.tomson@gmail.com</a>>> wrote:<br class=""><br class="">       ... or perhaps to rephrase:<br class=""><br class="">       In 3.9 it seems to be doing a smaller combine much sooner,<br class="">       whereas in 3.6 it deferred that till later in the instruction<br class="">       selection pattern matching - the latter was giving us better<br class="">       results because it seems to match a larger pattern than the<br class="">       former did in the earlier stage.<br class=""><br class="">       Phil<br class=""><br class="">       On Thu, Nov 3, 2016 at 2:07 PM, Phil Tomson<br class="">       <<a href="mailto:phil.a.tomson@gmail.com" class="">phil.a.tomson@gmail.com</a><span class="Apple-converted-space"> </span><<a href="mailto:phil.a.tomson@gmail.com" class="">mailto:phil.a.tomson@gmail.com</a>>> wrote:<br class=""><br class="">           Is there any way to get it to delay this optimization where<br class="">           it goes from this:<br class=""><br class="">           Initial selection DAG: BB#0 'bclr64:entry'<br class="">           SelectionDAG has 14 nodes:<br class="">             t0: ch = EntryToken<br class="">                 t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br class="">                       t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1<br class="">                     t6: i64 = sub t4, Constant:i64<1><br class="">                   t7: i64 = shl Constant:i64<1>, t6<br class="">                 t9: i64 = xor t7, Constant:i64<-1><br class="">               t10: i64 = and t2, t9<br class="">             t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br class="">             t13: ch = 
XSTGISD::Ret t12, Register:i64 %R1, t12:1<br class=""><br class=""><br class=""><br class="">           Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br class=""><br class="">           Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br class=""><br class="">           Combining: t11: i64 = Register %R1<br class=""><br class="">           Combining: t10: i64 = and t2, t9<br class=""><br class="">           Combining: t9: i64 = xor t7, Constant:i64<-1><br class="">            ... into: t15: i64 = rotl Constant:i64<-2>, t6<br class=""><br class="">           ...to this:<br class=""><br class="">           Optimized lowered selection DAG: BB#0 'bclr64:entry'<br class="">           SelectionDAG has 13 nodes:<br class="">             t0: ch = EntryToken<br class="">                 t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br class="">                     t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1<br class="">                   t17: i64 = add t4, Constant:i64<-1><br class="">                 t15: i64 = rotl Constant:i64<-2>, t17<br class="">               t10: i64 = and t2, t15<br class="">             t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br class="">             t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br class=""><br class=""><br class="">           That combining of the xor & and there ends up giving us<br class="">           suboptimal results as compared with 3.6.<br class=""><br class="">           For example, in 3.6 the generated code is simply:<br class=""><br class="">           bclr64:                                 # @bclr64<br class="">           # BB#0:                                 # %entry<br class="">               addI    r1, r1, -1, 64<br class="">               bclr        r1, r0, r1, 64<br class="">               jabs        r511<br class=""><br class="">           Whereas with 3.9 the generated code is:<br class=""><br class="">           bclr64:                                 
# @bclr64<br class="">           # BB#0:                                 # %entry<br class="">               addI    r1, r1, -1, 64<br class="">               movimm        r2, -2, 64<br class="">               rol        r1, r2, r1, 64<br class="">               bitop1        r1, r0, r1, AND, 64<br class="">               jabs        r511<br class=""><br class=""><br class="">           ... it also seems to be hurting some of our larger<br class="">           benchmarks, which used to contain several bclr (bit<br class="">           clear) instructions but now contain far fewer.<br class=""><br class="">           Phil<br class=""><br class=""><br class=""><br class=""><br class="">           On Wed, Nov 2, 2016 at 4:10 PM, Ryan Taylor<br class="">           <<a href="mailto:ryta1203@gmail.com" class="">ryta1203@gmail.com</a><span class="Apple-converted-space"> </span><<a href="mailto:ryta1203@gmail.com" class="">mailto:ryta1203@gmail.com</a>>> wrote:<br class=""><br class="">               I believe some of the ISD opcodes were introduced to allow for<br class="">               DAG optimizations under the assumption that some of the<br class="">               major architectures directly support these types of<br class="">               instructions.<br class=""><br class="">               -Ryan<br class=""><br class="">               On Wed, Nov 2, 2016 at 6:24 PM, Phil Tomson via llvm-dev<br class="">               <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class="">               <<a href="mailto:llvm-dev@lists.llvm.org" class="">mailto:llvm-dev@lists.llvm.org</a>>> wrote:<br class=""><br class="">                   We've recently moved our project from LLVM 3.6 to<br class="">                   LLVM 3.9.  
I noticed  one of our code generation<br class="">                   tests is breaking in 3.9.<br class=""><br class="">                   The test is:<br class=""><br class="">                    ; RUN: llc < %s -march=xstg | FileCheck %s<br class=""><br class="">                   define i64 @bclr64(i64 %a, i64 %b) nounwind readnone {<br class="">                   entry:<br class="">                   ; CHECK: bclr     r1, r0, r1, 64<br class="">                     %sub = sub i64 %b, 1<br class="">                     %shl = shl i64 1, %sub<br class="">                     %xor = xor i64 %shl, -1<br class="">                     %and = and i64 %a, %xor<br class="">                     ret i64 %and<br class="">                   }<br class=""><br class="">                   I ran llc with -debug to get a better idea of what's<br class="">                   going on and found:<br class=""><br class="">                   Initial selection DAG: BB#0 'bclr64:entry'<br class="">                   SelectionDAG has 14 nodes:<br class="">                     t0: ch = EntryToken<br class="">                         t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0<br class="">                               t4: i64,ch = CopyFromReg t0,<br class="">                   Register:i64 %vreg1<br class="">                             t6: i64 = sub t4, Constant:i64<1><br class="">                           t7: i64 = shl Constant:i64<1>, t6<br class="">                         t9: i64 = xor t7, Constant:i64<-1><br class="">                       t10: i64 = and t2, t9<br class="">                     t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10<br class="">                     t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1<br class=""><br class=""><br class=""><br class="">                   Combining: t13: ch = XSTGISD::Ret t12, Register:i64<br class="">                   %R1, t12:1<br class=""><br class="">                   Combining: t12: ch,glue = CopyToReg t0, 
Register:i64<br class="">                   %R1, t10<br class=""><br class="">                   Combining: t11: i64 = Register %R1<br class=""><br class="">                   Combining: t10: i64 = and t2, t9<br class=""><br class="">                   Combining: t9: i64 = xor t7, Constant:i64<-1><br class="">                    ... into: t15: i64 = rotl Constant:i64<-2>, t6<br class=""><br class="">                   Combining: t10: i64 = and t2, t15<br class=""><br class="">                   Combining: t15: i64 = rotl Constant:i64<-2>, t6<br class=""><br class="">                   Combining: t14: i64 = Constant<-2><br class=""><br class="">                   Combining: t6: i64 = sub t4, Constant:i64<1><br class="">                    ... into: t17: i64 = add t4, Constant:i64<-1><br class=""><br class="">                   Combining: t15: i64 = rotl Constant:i64<-2>, t17<br class=""><br class=""><br class=""><br class="">                   These rotl instructions weren't showing up when I<br class="">                   ran llc 3.6 and that's completely changing the<br class="">                   generated code at the end which means the test fails<br class="">                   (and it's less optimal than it was in 3.6).<br class=""><br class="">                   I've been looking in the LLVM language docs (3.9<br class="">                   version) and I don't see any documentation on<br class="">                   'rotl'. What does it do? 
Why isn't it in the docs?<br class=""><br class="">                   Phil<br class=""><br class="">                   _______________________________________________<br class="">                   LLVM Developers mailing list<br class="">                   <a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a><br class="">                   <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class=""><br class=""></blockquote></div></blockquote></div><br class=""></div></body></html>