Just a brief follow-up, mostly relaying my findings from IRC:<div><br></div><div>I looked in depth at loop_unroll to see why it slowed down. The inliner ran for 4.2% of the time, 2.7% of which was spent actually doing inlines. So the cost analysis is not hurting us here.</div>

<div><br></div><div>However, we are spending quite a bit of time in the optimizations I expect to benefit from better inlining: InstCombine, LSR, and GVN.</div><div><br></div><div>And, thankfully, we're getting significant runtime improvements from the time spent in these optimizers, so they aren't going off the deep end; they're actually simplifying code (if perhaps not as quickly as we'd like).</div>

<div><br></div><div>The conclusion seems to be that these patches are fine, and we just need to keep pressure on the scalar optimization passes to run as efficiently as possible. The improved inlining costs us compile time but seems to pay handsomely at runtime.</div>

<div><br></div><div>If folks have other significant compile-time regressions, I would be interested in having repro instructions. =]</div><div><br></div><div>Last but not least, the refactoring to do inline cost analysis per-callsite may actually make the analysis *faster* in several situations. As I'm going through this I'm finding lots of inefficiencies in the current design that should be fixed along the way.</div>

<div><br></div><div>-Chandler<br></div>