[PATCH] D17555: [Feedback requested] Implement cold splitting
Xinliang David Li via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 24 13:15:37 PST 2016
On Wed, Feb 24, 2016 at 1:04 PM, Sean Silva <chisophugis at gmail.com> wrote:
> silvas added a subscriber: silvas.
> silvas added a comment.
> I can see why this would help iTLB/paging, but I'm not grokking why it
> would help icache very much compared to per-function machine block
> placement ensuring that the cold stuff ends up at the end on a separate
> cacheline (does MBP already do that?). In fact (playing devil's advocate)
> the MBP approach could be more beneficial because it could allow branches
> to be relaxed to smaller encodings.
> The scenario where I can see this being a substantial win for icache over
> MBP is when you e.g. have two functions, each with 1.5 cachelines of hot text (and
> say 1 cacheline of cold text). With MBP, each function would end up using
> ceiling(1.5) = 2 cachelines for the hot and one cacheline for the cold, but
> with the splitting the linker would see 2x 1.5 cachelines hot + 2x 1
> cacheline cold and so you could put the two 1.5's together and only use 3
> cachelines for the hot part. How often does that occur (and does the linker
> actually manage to exploit this)?
yes -- the latest prefix based function grouping patch will teach linker to
> Since the benefit is based on the "rounding", we save at most just under
> ("just under" is determined by the text alignment) one cacheline every time
> we can pack these densely. The benefit is at most #hotFunctions *
> (sizeof(Cacheline) - alignof(Function)) text size for the hot working set.
On x86 a cacheline is only 64 bytes in size (small), which limits the overall
savings. Another factor is that the long jumps needed with function splitting
can offset some of the savings here.
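To make the arithmetic in this exchange concrete, here is a rough sketch of the cacheline counting. It is only an illustration, not anything from the patch: the function names are made up, the 16-byte function alignment is an assumed value, and the per-function layout assumes each function's hot text starts on a fresh cacheline. It also ignores the long-jump overhead mentioned above.

```python
CACHELINE = 64  # bytes; the x86 cacheline size mentioned above


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def hot_lines_per_function(hot_sizes, cacheline=CACHELINE):
    """Hot cachelines when each function's hot text is rounded up separately
    (assumption: every function's hot part starts on its own cacheline)."""
    return sum(ceil_div(size, cacheline) for size in hot_sizes)


def hot_lines_packed(hot_sizes, align=16, cacheline=CACHELINE):
    """Hot cachelines when the split-out hot parts are laid out back to back,
    each start rounded up to the (assumed) function alignment."""
    offset = 0
    for size in hot_sizes:
        offset = ceil_div(offset, align) * align  # align the start of this hot part
        offset += size
    return ceil_div(offset, cacheline)


def max_savings_bytes(num_hot, align=16, cacheline=CACHELINE):
    """Upper bound quoted above: #hotFunctions * (sizeof(Cacheline) - alignof(Function))."""
    return num_hot * (cacheline - align)


# Two functions, each with 1.5 cachelines (96 bytes) of hot text.
sizes = [96, 96]
unsplit = hot_lines_per_function(sizes)  # ceiling(1.5) + ceiling(1.5)
packed = hot_lines_packed(sizes)         # 192 contiguous bytes
saved_bytes = (unsplit - packed) * CACHELINE
```

For the two-function case this gives 4 cachelines versus 3, i.e. 64 bytes of hot text saved, which stays under the 2 * (64 - 16) = 96 byte bound for the assumed alignment.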
> That being said, this kind of low-level function splitting is a really
> powerful tool and I fully support adding it, but I agree with Mehdi that
> I'd like to see some supporting benchmark results.
Agreed -- I think this functionality is useful to have, but it should probably
be off by default and controlled by a user option.