[libcxx-commits] [PATCH] D87881: [libunwind] Optimize dl_iterate_phdr's findUnwindSectionsByPhdr
Ryan Prichard via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Thu Sep 17 21:31:41 PDT 2020
rprichard added a comment.
I uploaded the Android benchmark I used: https://gist.github.com/rprichard/1688de36f0132a40294adb0409189533.
Results on a Pixel 3 blueline device, running Q, with `taskset 10`:
Before:
ninja: Entering directory `out'
[102/102] /x/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/...so libbench_098.so libbench_099.so libbench_100.so libbench_final.so
out/: 105 files pushed, 0 skipped. 44.9 MB/s (2362029 bytes in 0.050s)
/x/multitime-android/multitime: 1 file pushed, 0 skipped. 226.5 MB/s (123912 bytes in 0.001s)
===> /data/local/tmp/multitime results
1: /data/local/tmp/out/main
            Mean                Std.Dev.    Min         Median      Max
real        0.4897+/-0.04305    0.1671      0.1677      0.4727      0.7813
user        0.4815+/-0.04297    0.1668      0.1533      0.4617      0.7733
sys         0.0048+/-0.00074    0.0029      0.0000      0.0033      0.0133
After:
ninja: Entering directory `out'
[102/102] /x/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/...so libbench_098.so libbench_099.so libbench_100.so libbench_final.so
out/: 105 files pushed, 0 skipped. 46.3 MB/s (2362029 bytes in 0.049s)
/x/multitime-android/multitime: 1 file pushed, 0 skipped. 247.7 MB/s (123912 bytes in 0.000s)
===> /data/local/tmp/multitime results
1: /data/local/tmp/out/main
            Mean                Std.Dev.    Min         Median      Max
real        0.3254+/-0.02297    0.0892      0.1630      0.3386      0.4572
user        0.3174+/-0.02284    0.0887      0.1567      0.3333      0.4500
sys         0.0051+/-0.00067    0.0026      0.0000      0.0067      0.0133
The large variation between min and max run-times comes from the `cbdata->targetAddr < pinfo->dlpi_addr` optimization in `findUnwindSectionsByPhdr`. On Bionic, when the dynamic loader loads a group of DSOs all at once, it randomizes the order in which the files are mapped into memory, so the order of the DSOs in the address space, relative to each other, varies from one run to the next. In other respects, the order of the libraries is deterministic -- e.g. the order in which constructors are called, and the order in which `dl_iterate_phdr` iterates over the DSOs.
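To illustrate why the mapping order matters, here is a minimal sketch of a `dl_iterate_phdr` callback that applies the same kind of base-address test before scanning program headers. It is not the libunwind code: `LookupState`, `lookupCallback`, `lookupModule`, and the two counters are hypothetical names made up for this example, and the sketch assumes a Linux-style `<link.h>`. The counters show how many modules were rejected cheaply versus fully scanned, which is the quantity that changes when the loader places DSOs in a different address order.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info, ElfW
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lookup state for this sketch (not libunwind's cbdata).
struct LookupState {
  uintptr_t targetAddr;  // address we want to attribute to a module
  uintptr_t moduleBase;  // dlpi_addr of the module that contains it
  int skippedEarly;      // modules rejected by the base-address test
  int scanned;           // modules whose program headers were walked
  bool found;
};

static int lookupCallback(dl_phdr_info *pinfo, size_t, void *data) {
  assert(data != nullptr);
  auto *state = static_cast<LookupState *>(data);

  // Cheap test: a module whose load base is above the target address
  // cannot contain it, so skip it without touching its program headers.
  if (state->targetAddr < pinfo->dlpi_addr) {
    ++state->skippedEarly;
    return 0;  // keep iterating
  }

  // Expensive path: walk the PT_LOAD segments of this module.
  ++state->scanned;
  for (ElfW(Half) i = 0; i < pinfo->dlpi_phnum; ++i) {
    const ElfW(Phdr) &phdr = pinfo->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD)
      continue;
    uintptr_t begin = pinfo->dlpi_addr + phdr.p_vaddr;
    uintptr_t end = begin + phdr.p_memsz;
    if (state->targetAddr >= begin && state->targetAddr < end) {
      state->moduleBase = pinfo->dlpi_addr;
      state->found = true;
      return 1;  // non-zero return stops the iteration
    }
  }
  return 0;
}

// Finds the loaded module containing addr and reports how much work it took.
static LookupState lookupModule(const void *addr) {
  LookupState state{reinterpret_cast<uintptr_t>(addr), 0, 0, 0, false};
  dl_iterate_phdr(lookupCallback, &state);
  return state;
}
```

Calling `lookupModule` on the address of an object in some DSO and comparing `skippedEarly` with `scanned` across runs gives a rough view of the effect: the iteration order stays fixed, but the number of modules that sit below the target address (and therefore get scanned) depends on where the loader happened to map them.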
FWIW: An Android NDK user did complain about the performance of EH with many libraries: https://github.com/android/ndk/issues/1062.
I'm still interested in making the FrameHeaderCache actually work on Android, but for now an NDK app can't use the cache because the `dlpi_adds`/`dlpi_subs` fields aren't present in most Android versions.
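For context on why those fields matter to a cache, the sketch below shows one way a cached lookup could be validated against the loader's load/unload counters. It is an assumption-laden illustration rather than the FrameHeaderCache implementation: `LoaderCounters`, `readCounters`, `snapshotCounters`, and `cacheStillValid` are hypothetical names, and the code only compiles where `<link.h>` declares `dlpi_adds` and `dlpi_subs`. The `size` argument passed to the callback is used to check that the loader actually filled in those fields.

```cpp
#include <link.h>   // dl_iterate_phdr, dl_phdr_info
#include <cstddef>  // offsetof, size_t

// Hypothetical snapshot of the loader's load/unload counters.
struct LoaderCounters {
  unsigned long long adds = 0;  // number of objects loaded so far
  unsigned long long subs = 0;  // number of objects unloaded so far
  bool available = false;       // false if the loader did not provide them
};

static int readCounters(dl_phdr_info *pinfo, size_t size, void *data) {
  auto *out = static_cast<LoaderCounters *>(data);
  // Only read the counters if the structure the loader passed in is large
  // enough to include them.
  if (size >= offsetof(dl_phdr_info, dlpi_subs) + sizeof(pinfo->dlpi_subs)) {
    out->adds = pinfo->dlpi_adds;
    out->subs = pinfo->dlpi_subs;
    out->available = true;
  }
  return 1;  // the first entry is enough, stop iterating
}

static LoaderCounters snapshotCounters() {
  LoaderCounters counters;
  dl_iterate_phdr(readCounters, &counters);
  return counters;
}

// A cached result is only trusted if no object was loaded or unloaded
// since the snapshot stored alongside it was taken.
static bool cacheStillValid(const LoaderCounters &cached) {
  LoaderCounters now = snapshotCounters();
  return cached.available && now.available &&
         cached.adds == now.adds && cached.subs == now.subs;
}
```

Without the counters there is no cheap signal that the set of loaded objects has changed, so a sketch like this would have to treat every cached entry as stale.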
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D87881/new/
https://reviews.llvm.org/D87881