<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Dec 18, 2016, at 1:00 PM, Finkel, Hal J. via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">




<div class="">

<div class="">

<div class=""></div>


<div class=""><br class="">

</div>

<div class="">On Dec 18, 2016 2:56 PM, via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div>

<div class="">></div>

<div class="">></div>

<div class="">> > On Dec 17, 2016, at 1:35 PM, Davide Italiano via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div>

<div class="">> > </div>

<div class="">> > First of all, sorry for the long mail.</div>

<div class="">> > Inspired by the excellent analysis Rui did for lld, I decided to do</div>

<div class="">> > the same for llvm.</div>

<div class="">> > I'm personally very interested in build-time for LTO configuration,</div>

<div class="">> > with particular attention to the time spent in the optimizer.</div>

<div class="">></div>

<div class="">> From our own offline regression testing, one of the biggest culprits is InstCombine’s known-bits calculation. A number of new known-bits checks have been added in the past few years (e.g. to infer nuw, nsw, etc. on various instructions), and the cost adds up considerably because *it is paid even if InstCombine does nothing*, since it is incurred on every visit to a relevant instruction.</div>

<div class=""><br class="">

</div>

<div class="">FWIW, I've started working on a patch to add a cache for InstCombine's (ValueTracking's) known-bits calculation. I hope to have it ready for posting soon.</div></div></div></div></blockquote><div><br class=""></div></div>That sounds great! When I last looked into compile time, about 10 months ago, I also saw computeKnownBits as the biggest performance problem.<div class="">Even small things add up: for example, load/store optimization calls computeKnownBits in an attempt to improve the alignment predictions for loads and stores, which leads to many nodes being queried over and over again. Feel free to add me as a reviewer!</div><div class=""><br class=""></div><div class="">- Matthias</div></body></html>