[PATCH] D94110: [CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation

Wei Mi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 27 22:35:29 PST 2021


wmi added a comment.

> Sum-up:
>
> - LBR-Trie and Global-Trie can have about 2x speed-up over the baseline
> - Global-Trie has a slight regression (about 10%) against LBR-Trie; as we discussed, this might be caused by hash overhead.

Thanks for the experiment.
LBR-Trie is faster than Global-Trie. Does that mean there is not enough call stack overlap between different LBR samples? Also, could you elaborate on what the hash overhead is?
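
To make the question concrete, here is a minimal sketch of how I picture a call frame trie with hash-map children. Everything in it (`FrameTrieNode`, `addSample`, `countNodes`, keying children by a raw address) is a hypothetical illustration and an assumption on my side, not the data structure in this patch. The point is that each frame of each inserted stack costs one hash lookup, so if stacks from different samples rarely share prefixes, the trie saves little work while still paying for the hashing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical node type for illustration only; not taken from the patch.
struct FrameTrieNode {
  // Number of samples whose call stack ends exactly at this frame.
  uint64_t SampleCount = 0;
  // Children keyed by frame address (an assumption); each access hashes the key.
  std::unordered_map<uint64_t, std::unique_ptr<FrameTrieNode>> Children;
};

// Insert one call stack (outermost frame first) and aggregate its count.
// Every frame in the stack costs one hash lookup in Children.
void addSample(FrameTrieNode &Root, const std::vector<uint64_t> &Stack,
               uint64_t Count) {
  assert(Count > 0 && "expected a non-zero sample count");
  FrameTrieNode *Node = &Root;
  for (uint64_t Addr : Stack) {
    std::unique_ptr<FrameTrieNode> &Child = Node->Children[Addr];
    if (!Child)
      Child = std::make_unique<FrameTrieNode>();
    Node = Child.get();
  }
  Node->SampleCount += Count;
}

// Count all nodes below Node. If this is much smaller than the total number
// of frames inserted, the stacks shared prefixes and the trie deduplicated them.
std::size_t countNodes(const FrameTrieNode &Node) {
  std::size_t N = 0;
  for (const auto &KV : Node.Children)
    N += 1 + countNodes(*KV.second);
  return N;
}
```

Comparing `countNodes` on the root against the total number of frames inserted would give a rough measure of how much overlap there actually is between samples.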

> Thanks for the quick experiment! Given that we don't see immediate speed up from global trie, I'm inclined to just use what you have in this patch, and defer further improvement for the future. What do you think?

I assume the times in the table are in seconds, e.g. 19.15 seconds for sjeng using LBR-Trie, which is not very long. I agree with Wenlei that you can leave the improvement for the future.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94110/new/

https://reviews.llvm.org/D94110


