<html><body bgcolor="#FFFFFF"><div>I am the designer of the Open64 hwloop structure, though I am not a student.</div><div><br></div><div>I hope the following helps:</div><div><br></div><div>To transform a loop into a hwloop, we need help from the optimizer. For example, we transform </div><div><span class="Apple-style-span" style="font-family: sans-serif; -webkit-tap-highlight-color: rgba(26, 26, 26, 0.296875); -webkit-composition-fill-color: rgba(175, 192, 227, 0.230469); -webkit-composition-frame-color: rgba(77, 128, 180, 0.230469); font-size: 13px; line-height: 19px; "><code lang="text" style="background-color: rgb(249, 249, 249); "><pre style="padding-top: 1em; padding-right: 1em; padding-bottom: 1em; padding-left: 1em; border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: dashed; border-right-style: dashed; border-bottom-style: dashed; border-left-style: dashed; border-top-color: rgb(47, 111, 171); border-right-color: rgb(47, 111, 171); border-bottom-color: rgb(47, 111, 171); border-left-color: rgb(47, 111, 171); color: black; background-color: rgb(249, 249, 249); line-height: 1.1em; ">   while(k3>=10){
     sum+=k1;
     k3 --;
   }
</pre></code><p style="margin-top: 0.4em; margin-right: 0px; margin-bottom: 0.5em; margin-left: 0px; line-height: 1.5em; ">into the form:<code lang="text" style="background-color: rgb(249, 249, 249); "></code></p><code lang="text" style="background-color: rgb(249, 249, 249); "><pre style="padding-top: 1em; padding-right: 1em; padding-bottom: 1em; padding-left: 1em; border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: dashed; border-right-style: dashed; border-bottom-style: dashed; border-left-style: dashed; border-top-color: rgb(47, 111, 171); border-right-color: rgb(47, 111, 171); border-bottom-color: rgb(47, 111, 171); border-left-color: rgb(47, 111, 171); color: black; background-color: rgb(249, 249, 249); line-height: 1.1em; ">   zdl_loop(k3-9) {
      sum+=k1;
   }
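   /* Sketch only, not Open64 code: zdl_trip_count is a hypothetical helper
      showing where the k3-9 above comes from. The while loop runs once for
      each value k3, k3-1, ..., 10, so k3-9 times in total; the zero-trip
      guard for k3 below 10 is an assumption about what the optimizer must
      preserve when it forms the hwloop. */
   int zdl_trip_count(int k3) { return k3 >= 10 ? k3 - 9 : 0; }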
</pre></code><p style="margin-top: 0.4em; margin-right: 0px; margin-bottom: 0.5em; margin-left: 0px; line-height: 1.5em; ">So we introduced a new ZDLBR operator in WHIRL (the Open64 optimizer's intermediate representation), which represents the loop as:<code lang="text" style="background-color: rgb(249, 249, 249); "></code></p><code lang="text" style="background-color: rgb(249, 249, 249); "><pre style="padding-top: 1em; padding-right: 1em; padding-bottom: 1em; padding-left: 1em; border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: dashed; border-right-style: dashed; border-bottom-style: dashed; border-left-style: dashed; border-top-color: rgb(47, 111, 171); border-right-color: rgb(47, 111, 171); border-bottom-color: rgb(47, 111, 171); border-left-color: rgb(47, 111, 171); color: black; background-color: rgb(249, 249, 249); line-height: 1.1em; ">LABEL L2050 0 {line: 0}
LOOP_INFO 0 1 1
   I4I4LDID 73 <1,2,.preg_I4> T<4,.predef_I4,4> # k3
   I4I4LDID 77 <1,2,.preg_I4> T<4,.predef_I4,4> # <preg> 
 END_LOOP_INFO
   I4I4LDID 74 <1,2,.preg_I4> T<4,.predef_I4,4> # k1
   I4I4LDID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum
  I4ADD
 I4STID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum {line: 5}
 ZDLBR L2050 {line: 0}
</pre><div>Then we let CG (the code generator) do the rest. This design abstracts the general operations in the optimizer and leaves the target-specific part to CG, where the loop remains a simulated op until CG loop optimization has finished. We implemented multi-level nested hwloops with this approach. We believe GCC's three doloop expand names do the same.</div></code></span><div><br></div>For more details, please take a look at</div><div><br></div><div><a href="http://wiki.open64.net/index.php/Zero_Delay_Loop">http://wiki.open64.net/index.php/Zero_Delay_Loop</a></div><div><br></div><div>Thanks<br>Gang</div><div><br>On 2012-11-22, at 19:00, Ivan Llopard <<a href="mailto:ivanllopard@gmail.com">ivanllopard@gmail.com</a>> wrote:<br><br></div><div></div><blockquote type="cite"><div><span>Hi Shuxin, Eli,</span><br><span></span><br><span>On 22/11/2012 03:19, Shuxin Yang wrote:</span><br><blockquote type="cite"><span>Hi, Ivan:</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>    My $0.02. hasZeroCostLooping() disabling unrolling does not seem to be</span><br></blockquote><blockquote type="cite"><span>appropriate for other architectures, at least the one I worked on before.</span><br></blockquote><span></span><br><span>I appreciate your feedback. Could you give an example where building a hw loop is not appropriate for your target?</span><br><span></span><br><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>   You mentioned:</span><br></blockquote><blockquote type="cite"><span>>Currently, we cannot detect them because the loop unroller is</span><br></blockquote><blockquote type="cite"><span>>unrolling them before entering into the codegen. 
Looking at its implementation,</span><br></blockquote><blockquote type="cite"><span>>it.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>  Could you please articulate why CG fails to recognize it?</span><br></blockquote><span></span><br><span>Well, just because the loop unrolling pass runs before the CG is called.</span><br><span></span><br><blockquote type="cite"><span> I remember that in GCC, hw loop recognition is done in an RTL pass, and in Open64, one</span><br></blockquote><blockquote type="cite"><span>student(?) added some stuff in Scalar Opt, instead of CodeGen, just for HW loops.</span><br></blockquote><blockquote type="cite"><span>I recall only one reason that sounds valid: preventing the loop from becoming</span><br></blockquote><blockquote type="cite"><span>too big to fit the HW constraints.</span><br></blockquote><span></span><br><span>It sounds very similar to our implementation. We've implemented the hw loop builder at IR level, just before isel, with new intrinsics that provide hw loop semantics. While intrinsics may look a bit tricky and additional isel code is needed to recognize them, this approach benefits from the existing scalar evolution functionality to detect trip counts. Therefore, it's based on the same interface as the loop unroller but, for architectural reasons, we have stronger constraints: e.g. we cannot build hw loops on loops with multiple exits.</span><br><span></span><br><span>The loop topology is important and our hw loop builder depends on it. 
I agree that hasZeroCostLoop may seem too restrictive.</span><br><span>What about something like hasZeroCostLoopTopology(Loop *L, unsigned TripCount) to complement the first one?</span><br><span></span><br><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>   The cost implied by hasZeroCostLoop() depends heavily on the underlying architecture;</span><br></blockquote><blockquote type="cite"><span>therefore the higher level opts don't know how to utilize this interface for cost modeling.</span><br></blockquote><blockquote type="cite"><span>Maybe we can add a pretty vague interface, say</span><br></blockquote><blockquote type="cite"><span>   hw-please-advice-unrolling-factor(the loop, current-unrolling-factor),</span><br></blockquote><blockquote type="cite"><span>to encapsulate whatever reasons the arch might have to curtail aggressive unrolling?</span><br></blockquote><span></span><br><span>There are already some internal parameters in the loop unroller to drive the heuristics. We use -unroll-count to skip unrolling.</span><br><span>But someone may want to enable unrolling even if the target says otherwise. 
IMHO, each target could provide internal flags to disable hw loop building and let the unroller work "normally".</span><br><span></span><br><span>Ivan</span><br><span></span><br><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>   I'm an LLVM newbie, so don't take my words too seriously.</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Have a happy holiday!</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>Shuxin</span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span></span><br></blockquote><blockquote type="cite"><span>On 11/21/2012 02:19 PM, Ivan Llopard wrote:</span><br></blockquote><blockquote type="cite"><blockquote type="cite"><span>Hi Hal,</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>On 21/11/2012 22:38, Hal Finkel wrote:</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>----- Original Message -----</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>From: "Ivan Llopard" <<a href="mailto:ivanllopard@gmail.com">ivanllopard@gmail.com</a>></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>To: "LLVM Developers Mailing List" <<a href="mailto:llvmdev@cs.uiuc.edu">llvmdev@cs.uiuc.edu</a>></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Sent: Wednesday, November 21, 2012 10:31:07 AM</span><br></blockquote></blockquote></blockquote></blockquote><blockquote
type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Subject: [LLVMdev] Disable loop unroll pass</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Hi,</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>We've a target which has hardware support for zero-overhead loops.</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Currently, we cannot detect them because the loop unroller is</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>unrolling</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>them before entering into the codegen. 
Looking at its implementation,</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>it</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>seems that it checks if it is profitable to unroll it or not based on</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>certain parameters.</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Given that zero cost loops building is based more or less on the same</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>constraints that loop unroll pass, I wonder if it is reasonable to</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>add</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>yet another target hook to prevent loop unrolling (something like</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>hasZeroOverheadLooping or hasZeroCostLooping) for targets that</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote 
type="cite"><blockquote type="cite"><span>support</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>zero-cost looping.</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Ivan,</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Please feel free to extend the ScalarTargetTransformInfo interface (in include/llvm/TargetTransformInfo.h) to provide target-customizable parameters to the loop unroller. This is on my TODO list, but if you'd like to work on this, that would be great.</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Sure! 
I'll propose a patch ASAP.</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Are there any cases in which loop unrolling is beneficial on your target?</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>I'd say that it's always beneficial to emit hardware loops whenever possible, either for static or dynamic trip counts, whether we look for smaller or faster code.</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span>Ivan</span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>  -Hal</span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Does Hexagon provide the same loop support? 
How have you addressed</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>this?</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>Ivan</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>_______________________________________________</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span>LLVM Developers mailing list</span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu"><a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a></a></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"><span></span><br></blockquote></blockquote></blockquote><blockquote type="cite"><blockquote 
type="cite"><span></span><br></blockquote></blockquote><blockquote type="cite"><span></span><br></blockquote><span></span><br></div></blockquote></body></html>