[llvm-dev] [RFC] Pagerando: Page-granularity code randomization

Wed Oct 10 10:39:34 PDT 2018

Hi all,

My thanks to everyone who has provided feedback, reviews, and
discussion both here and off-list; it is appreciated and has helped
improve the feature. We’ve continued to develop Pagerando over the
last year, and an update is long overdue. In summary, the initial
version of Pagerando, as described in the prior RFC
(http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html), is
implemented, tested, and stable for both ARM and AArch64. What it
needs now is review and feedback on patches D37580-D37587. We
would like to continue towards potential deployment for Android, which
means that the feature needs to live in-tree. We plan to continue to
improve and support this feature (see below), but we would like to
make sure that we will be able to eventually deploy this as part of
the platform toolchain.

Our target implementation and deployment platform for Pagerando is
Android, specifically system shared libraries. To this end, we are
working together with the Android team at Google to mature and
optimize Pagerando to meet the deployment constraints of AOSP. If
performance and overhead results are acceptable after further testing
and potential improvements, and barring any unforeseen issues, the
plan is to deploy Pagerando in the Android toolchain and platform
alongside other mitigations that are also being rolled out, such as
CFI. To move forward with integrating Pagerando into the Android
toolchain, we need to identify and resolve any issues that might block
integration of the feature into LLVM. I (and the Android security
team) would like to get Pagerando in-tree and continue to make
additional improvements from there.

We have built and tested AOSP images for both a Pixel and a Pixel 2
with Pagerando applied to all system shared libraries (with the
exception of pre-compiled vendor binaries). On both phones, we see
little to no runtime performance overhead for most workloads. For
example, Vellamo has an average overhead of 2-4%, and almost all
overheads are within a 95% confidence interval of the baseline. We
found one major outlier: an IPC micro-benchmark that spends much of
its time making function calls across DSO boundaries, which had an
overhead of up to 30% (below we detail improvements that specifically
target this case). The overhead here is mainly due to the higher fixed
cost of making cross-DSO calls relative to the small amount of time
the benchmark workload itself takes.

Memory and disk overheads of Pagerando are also important. With our
current implementation, combined disk overhead for all 32-bit and
64-bit system libraries in Android is 37% (about 102M). This is
partially due to a large number of unnecessary wrappers, which we will
be eliminating by compiling libraries with -fvisibility=hidden and
explicitly exporting only the intended external API. Total dynamic
memory overhead for Pagerando on Android after boot is about 25M,
which is about a 17% increase in DSO-mapped memory. This is much less
than the on-disk impact, since many Android system libraries are
rarely loaded. We’re looking at deploying Pagerando for a subset of
system libraries used in privileged processes, which would increase
overall DSO memory use by about 4% (6M, on devices that commonly have
2-4GB of RAM).
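
As a rough sanity check on these figures, the absolute and relative
overheads can be turned back into the baselines they imply. This is
only a sketch: it assumes that "M" means megabytes and that each
percentage is relative to the un-instrumented baseline, and the helper
name is made up for illustration.

```python
def implied_baseline(overhead_mb, overhead_fraction):
    """Baseline size (MB) implied by an absolute and a relative overhead."""
    return overhead_mb / overhead_fraction


# Figures quoted above: (absolute overhead in MB, relative overhead).
disk_base = implied_baseline(102, 0.37)        # all system libraries on disk
mem_base_all = implied_baseline(25, 0.17)      # DSO-mapped memory, all libs
mem_base_subset = implied_baseline(6, 0.04)    # DSO-mapped memory, subset

for label, value in [("disk", disk_base),
                     ("memory (all libraries)", mem_base_all),
                     ("memory (privileged subset)", mem_base_subset)]:
    print(f"{label}: implied baseline of about {round(value)} MB")
```

Under those assumptions, the two memory figures imply nearly the same
baseline of DSO-mapped memory (about 147M and 150M), and the disk
figure implies about 276M of system libraries before Pagerando.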

The Pagerando changes for LLVM consist of two major functional parts:
binning functions into page-aligned bins, and instrumenting externally
visible or address-taken functions with a wrapper that initializes a
Page Offset Table (POT) register. The former is specific to Pagerando;
the latter, however, may prove useful for other projects that need to
initialize a register upon entry from another DSO. I’m not sure
whether there is currently another use case for this kind of feature,
but decoupling wrappers from the rest of the Pagerando changes might
make them useful to the wider LLVM community.
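
To make the first part more concrete, here is a minimal sketch of
binning and of reaching a function through a table of bin addresses.
It is not the algorithm used in the patches: it assumes a 4096-byte
page, uses a generic first-fit-decreasing heuristic, gives any
function larger than a page a bin to itself, and all names
(bin_functions, build_pot, resolve) are hypothetical.

```python
import random

PAGE_SIZE = 4096  # assumed; the real page size is target-dependent


def bin_functions(sizes, page_size=PAGE_SIZE):
    """First-fit-decreasing assignment of functions to page-sized bins.

    sizes maps a function name to its code size in bytes. Returns
    (bins, used): bins[i] is a list of (name, offset_in_bin) pairs and
    used[i] is the number of bytes occupied in bin i. A function larger
    than one page gets a bin to itself (a simplification of this sketch).
    """
    bins, used = [], []
    # Largest first; ties broken by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        for i, filled in enumerate(used):
            if filled + size <= page_size:
                bins[i].append((name, filled))
                used[i] += size
                break
        else:
            bins.append([(name, 0)])
            used.append(size)
    return bins, used


def build_pot(used, rng, page_size=PAGE_SIZE):
    """Place bins in shuffled order and return their base addresses.

    The returned list plays the role of the POT in this model: entry i
    is the page-aligned base address of bin i for this layout.
    """
    order = list(range(len(used)))
    rng.shuffle(order)
    pot, next_base = [0] * len(used), 0
    for b in order:
        pot[b] = next_base
        pages = max(1, -(-used[b] // page_size))  # ceiling division
        next_base += pages * page_size
    return pot


def resolve(pot, bin_index, offset):
    """Address of a function reached through the table."""
    return pot[bin_index] + offset


sizes = {"a": 3000, "b": 2000, "c": 1000, "d": 5000, "e": 96}
bins, used = bin_functions(sizes)
pot = build_pot(used, random.Random(0))
print(bins)
print(pot)
```

The point of the indirection is that code only ever names a bin index
and an offset within that bin, so the bins can be placed anywhere at
load time as long as the table is filled in to match.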

Feedback on our initial version of Pagerando has been positive,
although performance is always a concern. Much of the performance
impact comes from the wrappers required to mix instrumented and
un-instrumented code, but wrappers can be avoided in many
circumstances. With an optimized version of Pagerando using a single
unified POT instead of a per-DSO table, we can skip wrappers while
staying within code that at least reserves the POT register. Wrappers
are then only required for backward compatibility with 3rd-party DSOs.
We have implemented this optimized, unified POT extension to Pagerando
(https://github.com/immunant/android_llvm/tree/pagerando_unifiedpot)
and it speeds up the most-impacted benchmarks as expected. For
simplicity, I haven’t integrated this into the current patches on
Phabricator, but we can easily add it after the initial
implementation lands, since it does not require many additional
changes. Further optimizations are possible absent 3rd-party DSOs.
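
The difference between the per-DSO and unified tables can be
summarized as a small decision function. This is a conceptual model of
the reasoning above, not code from the patches: the real wrappers are
attached to functions at compile time rather than decided per call,
and the function and parameter names are hypothetical.

```python
def needs_wrapper(caller_reserves_pot, same_dso, unified_pot):
    """Whether a call has to enter through a POT-initializing wrapper.

    caller_reserves_pot: the caller was compiled to reserve the POT register.
    same_dso: caller and callee live in the same DSO.
    unified_pot: one shared POT is used instead of one table per DSO.
    """
    if not caller_reserves_pot:
        # Un-instrumented (e.g. 3rd-party) code never set up the register.
        return True
    if unified_pot:
        # The register already refers to the single shared table.
        return False
    # Per-DSO tables: the register holds the caller's table, which is
    # only the right one for the callee when both are in the same DSO.
    return not same_dso


# A cross-DSO call between two instrumented libraries:
print(needs_wrapper(True, same_dso=False, unified_pot=False))
print(needs_wrapper(True, same_dso=False, unified_pot=True))
```

In this model, the cross-DSO case that hurt the IPC micro-benchmark is
exactly the one where the unified table removes the wrapper.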

In conclusion, I would greatly appreciate any reviews of the patches,
especially ones focusing on fit and integration into LLVM as a whole,
as well as a specific review of the backend changes. The reviews so
far have been very helpful, but I won’t be comfortable with something
of this complexity until we get more eyeballs on the code. I’m more
than happy to address any issues and improve the patches.

Thanks,
Stephen