[PATCH] D94110: [CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation
Wei Mi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 27 22:35:29 PST 2021
wmi added a comment.
> Sum-up:
>
> - LBR-Trie and Global-Trie can have about 2x speed-up over the baseline
> - Global-Trie has a slight regression (about 10%) against LBR-Trie; as we discussed, this might be caused by hash overhead.
Thanks for the experiment.
LBR-Trie is faster than Global-Trie. Does that mean there is not enough call stack overlap between different LBR samples? Also, could you elaborate on what the hash overhead is?
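To make the overlap question concrete, here is a rough sketch of how one could measure prefix sharing when call stacks are aggregated on a trie. All names (FrameNode, FrameTrie, addSample, overlapRatio) are hypothetical and are not taken from the patch; stacks are assumed to be vectors of frame addresses, outermost caller first. The ordered std::map is used only to keep the sketch simple; a hashed container would be where any hash overhead shows up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Hypothetical illustration only; not the llvm-profgen data structure.
struct FrameNode {
  uint64_t SampleCount = 0; // samples whose call stack ends at this node
  // Children keyed by frame address (assumed representation).
  std::map<uint64_t, std::unique_ptr<FrameNode>> Children;
};

struct FrameTrie {
  FrameNode Root;
  std::size_t NumNodes = 0;      // trie nodes created so far
  std::size_t NumFramesSeen = 0; // frames walked over all inserted samples

  // Insert one call stack, outermost caller first, and add Count to its leaf.
  void addSample(const std::vector<uint64_t> &Stack, uint64_t Count = 1) {
    assert(Count > 0 && "a sample must contribute at least one count");
    FrameNode *Cur = &Root;
    for (uint64_t Addr : Stack) {
      std::unique_ptr<FrameNode> &Child = Cur->Children[Addr];
      if (!Child) {
        // First time this frame is seen under this prefix: no sharing.
        Child = std::make_unique<FrameNode>();
        ++NumNodes;
      }
      Cur = Child.get();
      ++NumFramesSeen;
    }
    Cur->SampleCount += Count;
  }

  // Fraction of walked frames that reused an existing node.
  // Close to 0 means little overlap; close to 1 means heavy sharing.
  double overlapRatio() const {
    if (NumFramesSeen == 0)
      return 0.0;
    return 1.0 - static_cast<double>(NumNodes) / NumFramesSeen;
  }
};
```

If a ratio like this stays low when all samples go into one global trie, the extra per-frame lookups would buy little aggregation, which would be one possible explanation for the numbers above.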
> Thanks for the quick experiment! Given that we don't see immediate speed up from global trie, I'm inclined to just use what you have in this patch, and defer further improvement for the future. What do you think?
I assume the time in the table is in seconds -- 19.15 seconds for sjeng using LBR-Trie, which is not very long. I agree with Wenlei that you can leave the improvement for the future.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94110/new/
https://reviews.llvm.org/D94110