<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 1, 2014, at 11:07 AM, Serge Pavlov &lt;<a href="mailto:sepavloff@gmail.com" class="">sepavloff@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Quentin,<br class=""><div class="gmail_extra"><br clear="all" class=""><div class="">2014-10-02 0:21 GMT+07:00 Quentin Colombet <span dir="ltr" class="">&lt;<a href="mailto:qcolombet@apple.com" target="_blank" class="">qcolombet@apple.com</a>&gt;</span>:<br class=""></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><span class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">The constant hoisting pass does this kind of thing. Should we try to teach it to handle these kinds of cases?</span></div></blockquote></div></div></div></div></div></div></div></blockquote><div class="">That would be interesting. 
However, this pass is x86-specific and can use processor features (subregister structure, loading a 64-bit value with a 32-bit move). Can these features be used by constant hoisting? </div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Maybe. This pass has a bunch of target hooks if I remember correctly. Juergen would know better :).</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">OK, looking forward to advice :)</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><span class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><div class=""><div class=""><div class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">Moreover, this may be beneficial for code size, but I guess it is generally not beneficial for performance. Therefore, I believe this should be done for functions with the Os or Oz attributes only.</span></div></blockquote></div></div></div></div></div></div></div></blockquote><div class="">Just curious, why? 
Moves from a register must be faster than moves from memory.</div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Yes, but those are moves from an immediate, which do not require memory at all.</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">On the other hand, it loads the memory bus.</div></div></div></div></div></blockquote><div><br class=""></div><div>You mean by its encoding size, or am I missing something?</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><div class="">My performance concerns are:</div><div class="">- Register pressure, like Rafael mentioned.</div><div class="">- Additional scheduling dependencies.</div><div class=""><br class=""></div><div class="">Going back to your example:</div><div class="">This yields two independent chains of computation that can be scheduled independently. Moreover, you need just one register to realize this sequence.</div><div class=""><span class="">  mov $0, 0x4(%esi)<br class="">   mov $0, 0x8(%esi)<br class=""><br class=""></span>The two sequences of computation now have to wait for the first mov immediate. 
Moreover, this sequence requires 2 registers.<span class=""><br class="">   mov $0, %eax<br class="">   mov %eax, 0x4(%esi)<br class="">   mov %eax, 0x8(%esi)</span></div><div class=""><br class=""></div></div></div></div></blockquote><div class=""><br class=""></div><div class="">Looks reasonable, thank you for the explanation.</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><div class=""></div><span class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""> Both gcc and icc use moves from a register when compiling with optimization.  </div></div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">Sure. What I am saying is that, generally speaking, trading an immediate-to-register copy for a register-to-register copy does not sound beneficial to me. </div><div class=""><br class=""></div><div class="">Apart from code size improvements, what kind of improvements are you seeing?</div></div></div></div></blockquote><div class=""> </div><div class="">The main goal was code size. In fact, the main interest is optimization of memset; the problem is described in <a href="http://llvm.org/bugs/show_bug.cgi?id=5124" class="">http://llvm.org/bugs/show_bug.cgi?id=5124</a>.  
</div></div></div></div></div></blockquote><div><br class=""></div><div>Thanks for the context.</div><div>This seems to confirm that, at least at first, we should do that just for Os and Oz functions (i.e., look for the function attributes OptimizeForSize and MinSize<span style="background-color: rgb(255, 255, 255);" class="">).</span></div><div><span style="background-color: rgb(255, 255, 255);" class=""><br class=""></span></div><div><span style="background-color: rgb(255, 255, 255);" class="">At this point, I’ll wait for Juergen's feedback on constant hoisting before doing anything.</span></div><div><span style="background-color: rgb(255, 255, 255);" class="">Based on his answer, we will see how to move forward.</span></div><div><span style="background-color: rgb(255, 255, 255);" class=""><br class=""></span></div><div><span style="background-color: rgb(255, 255, 255);" class="">Thanks again,</span></div><div><span style="background-color: rgb(255, 255, 255);" class="">-Quentin</span></div><br class=""><blockquote type="cite" class=""><div dir="ltr" class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><div class=""><div class="">Also, how big are those improvements?</div><div class=""><br class=""></div></div></div></div></blockquote><div class=""><br class=""></div><div class="">Compilation of the PHP distribution with and without this pass shows a size reduction of about 0.3%.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> --Serge</div></div></div></div>
</blockquote></div><br class=""></body></html>