[Openmp-commits] [PATCH] D95819: [OpenMP] libomp cleanup: move fast allocation routines to kmp_tasking.cpp
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sun Feb 7 13:32:18 PST 2021
jdoerfert added a comment.
In D95819#2547482 <https://reviews.llvm.org/D95819#2547482>, @AndreyChurbanov wrote:
> In D95819#2534991 <https://reviews.llvm.org/D95819#2534991>, @jdoerfert wrote:
>
>> I don't know if this is the right direction. Placing code based on call profiles seems to break the idea of modularity. I mean, `___kmp_fast_allocate` is now a "tasking" thing?
>> I didn't see a reply yet, what about LTO for the runtime?
>
> @jdoerfert, thanks for the hint. I ran some performance experiments on the SpecOMP 2012 376.kdtree test (which was the initial trigger for this patch). The results show that the patch gives a small performance gain with the current library build but hurts performance with an LTO build. Moreover, the performance gain disappears if I also apply the diff from https://reviews.llvm.org/D95816 (named patch2 in the data below).
>
> Some numbers (time in seconds).
> I used the Intel 19 compiler + libomp on 2x Xeon Gold 6252 (48 cores, 48 threads used):
> trunk - 401
> trunk+patch - 395
> trunk+patch2 - 395
> trunk-lto - 385
> trunk+patch-lto - 387
>
> A similar performance trend is seen on the other platforms I have for testing.
>
> So given that the library built with LTO gives better performance, and this patch hurts it, I am abandoning the patch.
Very interesting numbers!
Should we enable LTO for the runtime build by default (assuming the compiler + linker combo allows it)? I doubt the compile-time hit is too bad, and the performance win in the numbers above would certainly be worth it.
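To put the quoted timings in relative terms, here is a minimal sketch that computes the percentage reduction in run time against plain trunk. The timings are copied from the measurements above; the helper name `percent_faster` is hypothetical and only for illustration, and the figures come from the single set of runs reported, with no variance information.

```python
# Run times in seconds, copied from the quoted 376.kdtree measurements.
timings = {
    "trunk": 401,
    "trunk+patch": 395,
    "trunk+patch2": 395,
    "trunk-lto": 385,
    "trunk+patch-lto": 387,
}


def percent_faster(baseline, candidate):
    """Reduction in run time relative to the baseline, in percent.

    Hypothetical helper for illustration; negative means slower.
    """
    return 100.0 * (baseline - candidate) / baseline


base = timings["trunk"]
improvements = {name: percent_faster(base, t) for name, t in timings.items()}

for name, gain in improvements.items():
    print(f"{name:16s} {timings[name]:4d} s  {gain:6.2f}% vs trunk")

# Effect of the patch when LTO is already enabled (negative: a slowdown).
patch_on_lto = percent_faster(timings["trunk-lto"], timings["trunk+patch-lto"])
print(f"patch on top of LTO: {patch_on_lto:.2f}%")
```

Under these numbers, LTO alone is roughly 4% faster than trunk, the patch alone roughly 1.5%, and the patch applied on top of LTO is about 0.5% slower than LTO alone.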
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95819/new/
https://reviews.llvm.org/D95819