<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Aug 22, 2016, at 2:52 PM, bryant <<a href="mailto:3.14472+reviews.llvm.org@gmail.com" class="">3.14472+reviews.llvm.org@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">bryant added a comment.<br class=""><br class="">In <a href="https://reviews.llvm.org/D23253#522534" class="">https://reviews.llvm.org/D23253#522534</a>, @qcolombet wrote:<br class=""><br class=""><blockquote type="cite" class="">Hi,<br class=""></blockquote><br class=""><br class="">Thanks for actually reviewing this.<br class=""><br class=""><blockquote type="cite" class="">What is the performance impact (both runtime and compile time) of the pass?<br class=""></blockquote><br class=""><br class="">Run-time impact, according to IACA, would be 1% to 40% (`pre-ra-sched.ll`) faster. Never worse. I'll measure the compile-time perf tonight. Do you have any suggestions on the preferred way to do this?</div></div></blockquote><div><br class=""></div><div>Yes, do the compile-time and runtime measurements on the llvm test-suite (see my next answer).</div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><blockquote type="cite" class="">You list results for test/CodeGen/X86, which is not usually what we use for perform testing. Could you try on the LLVM test suite?<br class=""></blockquote><br class=""><br class="">My measurements have thus far centered around the performance of generated code. Since this pass only ever runs when -march=x86/-64, there would be no difference in the rest of the arches, improvements or otherwise. If you really want, I could run the rest of the suite but the result would be a bunch of passes and xfails. What sort of results do you really want me to record here?<br class=""></div></div></blockquote><div><br class=""></div><div>I meant the LLVM test-suite:</div><div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><a href="http://llvm.org/svn/llvm-project/test-suite/trunk" class="">http://llvm.org/svn/llvm-project/test-suite/trunk</a></span></div></div><div><br class=""></div>Not the tests from other targets :).</div><div><br class=""></div><div>The listed repository has a bunch of benchmark that we use for performance comparison.</div><div><a href="http://llvm.org/docs/TestingGuide.html" class="">http://llvm.org/docs/TestingGuide.html</a></div><div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><blockquote type="cite" class="">Moreover, like Michael said, we should try to merge this pass with something else. A possible approach would be a peephole like optimization where we can plug more patterns.<br class=""></blockquote><br class=""><br class="">As I've mentioned before, this depends on virtual register LiveInterval information that is lost after the rewrite pass.<br class=""></div></div></blockquote><div><br class=""></div><div>I don’t see why this is a problem. We have other means to compute liveness and if we really need to run that pass at this specific point, we could have such a mode in that hypothetical peephole.</div><div>By the way, have you consider running the pass after rewrite and use LivePhysReg utility?</div><div>I believe that should work and probably make the code simpler.</div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><blockquote type="cite" class="">Cheers,<br class=""></blockquote><br class=""><blockquote type="cite" class="">-Quentin<br class=""></blockquote><br class=""><br class=""><br class=""><br class=""><br class="">================<br class="">Comment at: lib/CodeGen/CalcSpillWeights.cpp:46-48<br class="">@@ -45,5 +45,5 @@<br class=""> // Return the preferred allocation register for reg, given a COPY instruction.<br class="">-static unsigned copyHint(const MachineInstr *mi, unsigned reg,<br class="">-                         const TargetRegisterInfo &tri,<br class="">-                         const MachineRegisterInfo &mri) {<br class="">+unsigned VirtRegAuxInfo::copyHint(const MachineInstr *mi, unsigned reg,<br class="">+                                  const TargetRegisterInfo &tri,<br class="">+                                  const MachineRegisterInfo &mri) {<br class="">   unsigned sub, hreg, hsub;<br class="">----------------<br class="">qcolombet wrote:<br class=""><blockquote type="cite" class="">This seems wrong to export this interface.<br class="">External users should rely on MachineRegisterInfo::getSimpleHint.<br class=""><br class="">What is the problem in using MachineRegisterInfo::getSimpleHint?<br class=""></blockquote>In cases where there is more than one hint, getSimpleHint only returns one and misses the rest.<br class=""></div></div></blockquote><div><br class=""></div><div>Each virtual register has only one hint in the current scheme. I actually don’t see how you would get different results with copyHint (this is what is used to populate those hints). Could you elaborate on the cases you saw differences?</div><div><br class=""></div><div>Thanks,</div><div>-Quentin</div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><br class="">Repository:<br class="">  rL LLVM<br class=""><br class=""><a href="https://reviews.llvm.org/D23253" class="">https://reviews.llvm.org/D23253</a><br class=""><br class=""><br class=""><br class=""></div></div></blockquote></div><br class=""></body></html>