[llvm-dev] [RFC] Pagerando: Page-granularity code randomization

Stephen Crane via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 6 10:55:09 PDT 2017

This RFC describes pagerando, an improvement upon ASLR for shared
libraries. We're planning to submit this work for upstreaming and
would appreciate feedback before we get to the patch submission stage.

Pagerando randomizes the location of individual memory pages (ASLR
only randomizes the library base address). This increases security
against code-reuse attacks (such as ROP) by tolerating pointer leaks.
Pagerando splits libraries into page-aligned bins at compile time. At
load time, each bin is mapped to a random address. The code in each
bin is immutable and thus shared between processes.

To implement pagerando, the compiler and linker need to build shared
libraries with text segments split into page-aligned (and ideally
page-sized) bins. All inter-bin references are indirected through a
table initialized by the dynamic loader that holds the absolute
address of each bin. At load time the loader randomly chooses an
address for each bin and maps the bin pages from disk into memory.
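The load-time step can be sketched as a small simulation. This is only an illustration, not the proposed loader: the 4 KiB page size, the size of the address space, and the name `place_bins` are assumptions made for this example.

```python
import random

PAGE_SIZE = 4096  # assumed page size for this sketch


def place_bins(num_bins, pages_per_bin=1, addr_space_pages=1 << 20, rng=None):
    """Pick a distinct, page-aligned, non-overlapping base address per bin."""
    if rng is None:
        rng = random.Random()
    # Sample distinct slots, each wide enough for one bin, so bins never overlap.
    slots = rng.sample(range(addr_space_pages // pages_per_bin), num_bins)
    return [slot * pages_per_bin * PAGE_SIZE for slot in slots]


# The resulting list plays the role of the table of absolute bin addresses.
bin_table = place_bins(num_bins=4, rng=random.Random(0))
```

A real loader would additionally have to avoid existing mappings and map the file-backed pages at the chosen addresses; the sketch only shows the independent choice of one address per bin.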

We're focusing on ARM and AArch64 initially, although there is nothing
particularly target-specific that precludes support for other LLVM
targets.

## Design Goals

1. Improve security over ASLR. The randomization granularity
determines how much information a single code pointer leaks. A pointer
to a page reveals less about the location of other code than a pointer
into a contiguous library would.
2. Avoid randomizing files on disk. Modern operating systems provide
verified boot techniques to detect tampering with files. Randomizing
the on-disk layout of system libraries would interfere with the
trusted boot process. Randomizing libraries at compile or link time
would also needlessly complicate deployment and provisioning.
3. Preserve code page sharing. The OS reduces memory usage by mapping
shared file pages to the same physical memory in each process and
locates these pages at different virtual addresses with ASLR. To
preserve sharing of code pages, we cannot modify the contents of
file-mapped pages at load time and are restricted to changing their
ordering and placement in the virtual address space.
4. Backwards compatibility. Randomized code must interoperate
transparently with existing, unmodified executables and shared
libraries. Calls into randomized code must work as-is according to the
normal ABI.
5. Compatibility with other mitigations. Enabling randomization must
not preclude deploying other mitigations such as control-flow
integrity.
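To make the first goal concrete, the following toy model compares what a single leaked code pointer gives an attacker under plain ASLR and under pagerando. The function names and the bin assignment are made up for illustration, and the model assumes the attacker knows the library's on-disk layout.

```python
def revealed_by_leak(leaked_func, bin_of, pagerando):
    """Return the functions whose addresses follow from one leaked pointer."""
    if not pagerando:
        # Plain ASLR: all offsets from the library base are fixed, so one
        # leaked pointer locates every function in the library.
        return set(bin_of)
    # Pagerando: only functions sharing the leaked function's bin keep a
    # known offset relative to the leaked pointer.
    return {f for f, b in bin_of.items() if b == bin_of[leaked_func]}


# Hypothetical library layout: function name -> bin number.
bin_of = {"open": 0, "read": 0, "exec": 1, "system": 2}
```

With this layout, leaking `read` exposes all four functions under ASLR but only `open` and `read` under pagerando.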

## Pagerando Design

Pagerando requires a platform-specific extension to the dynamic
loading ABI that compatible libraries can opt in to. To decouple the
address of each code bin (segment) from that of other bins and global
data, we must disallow relative addressing between different bin
segments as well as between legacy segments and bin segments.

To prepare a library for pagerando, the compiler must first allocate
functions into page-aligned bins corresponding to segments in the
final ELF file. Since these bins will be independently positioned, the
compiler must redirect all inter-bin references through an indirection
table – the Page Offset Table (POT) – which stores the virtual address
of each bin in the library. Indices of POT entries and bin offsets are
statically determined at link time so code will not require any
dynamic relocations to reference functions in another bin or globals
outside of bins. We reserve a register in pagerando-compatible code to
hold the address of the POT. This register is initialized on entry to
the shared library. At load time the dynamic loader maps code bins at
independent, random addresses and updates the dynamic relocations in
the POT.
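A minimal model of the POT indirection, assuming a hypothetical `Library` class and symbol table that are not part of the proposal: the (POT index, bin offset) pair of each function is fixed at link time, and only the POT contents change from one load to the next.

```python
class Library:
    """Toy model of a pagerando library; all names here are illustrative."""

    def __init__(self, symbols):
        # name -> (POT index, offset within bin), both fixed at link time
        self.symbols = symbols
        self.pot = []

    def load(self, bin_addresses):
        # The loader writes the randomly chosen bin addresses into the POT.
        self.pot = list(bin_addresses)

    def resolve(self, name):
        # An inter-bin reference is one POT load plus a static offset.
        index, offset = self.symbols[name]
        return self.pot[index] + offset


lib = Library({"foo": (0, 0x10), "bar": (1, 0x200)})
lib.load([0x7000, 0x3000])
first = lib.resolve("bar")   # 0x3000 + 0x200
lib.load([0x1000, 0x9000])   # a different process or run picks other addresses
second = lib.resolve("bar")  # 0x9000 + 0x200
```

The symbol table is identical across both loads, which is why the code pages themselves need no load-time patching.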

Reserving a register to hold the POT address changes the internal ABI
calling convention and requires that the POT register be correctly
initialized when entering a library from external code. To initialize
the register, the compiler emits entry wrappers which save the old
contents of the POT register if necessary, initialize the POT
register, and call the target function. Each externally visible
function (conservatively including all address taken functions) needs
an entry wrapper which replaces the function for all external uses.
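The behaviour of an entry wrapper can be modelled as follows. This is a sketch in Python rather than the code the compiler would emit; `Cpu`, `make_entry_wrapper`, and the address `0x5000` are placeholders invented for the example.

```python
class Cpu:
    """Holds the one piece of machine state this sketch cares about."""

    def __init__(self):
        self.pot_reg = None  # models the reserved POT register


def make_entry_wrapper(cpu, pot_address, target):
    """Wrap `target` so it always runs with the POT register initialized."""

    def wrapper(*args):
        saved = cpu.pot_reg        # save the caller's value
        cpu.pot_reg = pot_address  # initialize the POT register
        try:
            return target(*args)   # call the real function
        finally:
            cpu.pot_reg = saved    # restore on the way out

    return wrapper


cpu = Cpu()
cpu.pot_reg = "caller value"
internal = lambda x: (cpu.pot_reg, x + 1)  # observes the POT register
exported = make_entry_wrapper(cpu, 0x5000, internal)
seen, result = exported(41)
```

External callers only ever see `exported`, so the internal function can rely on the register being valid.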

To optimally pack functions into bins and avoid new static
relocations, we propose using (traditional) LTO. With new static
relocations (i.e. linker cooperation), LTO would not be necessary, but
it is still desirable for more efficient bin packing.

The design of pagerando is based on the mitigations proposed by Backes
and Nürnberger [1], with improvements for compatibility and
deployability. The present design is a refinement of our first
pagerando prototype [2].

## LLVM Changes

To implement pagerando, we propose the following LLVM changes:

New module pass to create entry wrapper functions. This pass will
create entry wrappers as described above and replace exported function
names and all address taken uses with the wrapper. This pass will only
be run when pagerando is enabled.

Instruction Lowering. Pagerando-compatible code must access all global
values (including functions) through the POT since PC-relative memory
addressing is not allowed between a bin and another segment. We
propose that when pagerando is enabled, all global variable accesses
from functions marked as pagerando-compatible must be lowered into
GOT-relative accesses and added to the GOT address loaded from the POT
(currently stored in the first POT entry). Lowering of direct function
calls targeting pagerando-compatible code is slightly more complicated
because we need to determine the POT index of the bin containing the
target function if the target is not in the same bin. However, we
cannot properly allocate functions to bins until they have been lowered
and an approximate size is available. Therefore, during lowering we should
assume that all function calls must be made indirectly through the POT
with the computation of the POT index and bin offset of the target
function postponed until assembly printing.
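The two lowering rules can be sketched like this. The dictionary-based "instructions" and the helper names are assumptions for illustration only; the one detail taken from the text is that the GOT address currently lives in the first POT entry.

```python
GOT_POT_INDEX = 0  # per the text, the GOT address is currently the first POT entry


def lower_global_access(pot, got_relative_offset):
    """Address of a global: GOT address loaded from the POT plus an offset."""
    return pot[GOT_POT_INDEX] + got_relative_offset


def lower_call(callee):
    """At lowering time the bin is unknown, so emit a symbolic POT call."""
    return {"kind": "pot_call", "callee": callee}


def finalize_call(call, pot_index_of, bin_offset_of):
    """At assembly printing, fill in the POT index and bin offset."""
    callee = call["callee"]
    return {
        "kind": "pot_call",
        "pot_index": pot_index_of[callee],
        "bin_offset": bin_offset_of[callee],
    }


pending = lower_call("helper")
final = finalize_call(pending, {"helper": 2}, {"helper": 0x40})
```

Splitting the work into `lower_call` and `finalize_call` mirrors the postponement described above: the call shape is fixed early, while the concrete index and offset are filled in once bins are known.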

New machine module LTO pass to allocate functions into bins. This pass
relies on targets implementing
TargetInstrInfo::getInstSizeInBytes(MachineInstr) so that it knows
(approximately) how large the final function code will be. Functions
can also be packed so that the number of inter-bin calls is minimized
by taking the function call graph and/or execution profiles into
account while packing. This
pass only needs to run when pagerando is enabled.
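As one possible packing strategy, the sketch below uses a first-fit-decreasing heuristic over estimated function sizes. The RFC does not commit to a particular algorithm, so this choice, the 4 KiB bin size, and the function names are assumptions; the sketch also ignores the call graph and profile data mentioned above.

```python
PAGE_SIZE = 4096  # assumed bin size (one page)


def pack_functions(sizes, bin_size=PAGE_SIZE):
    """Assign functions to bins with a first-fit-decreasing heuristic."""
    bins = []  # each bin: {"free": bytes left, "funcs": [names]}
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        for b in bins:
            if b["free"] >= size:
                break  # first existing bin with enough room
        else:
            # No bin fits: open a new one, rounded up to whole pages so that
            # a function larger than one page still gets a page-aligned bin.
            pages = max(1, -(-size // bin_size))
            b = {"free": pages * bin_size, "funcs": []}
            bins.append(b)
        b["funcs"].append(name)
        b["free"] -= size
    return [b["funcs"] for b in bins]


layout = pack_functions({"a": 3000, "b": 2000, "c": 1000, "d": 2000})
```

Here `a` and `c` share one bin and `b` and `d` share another, so four functions occupy two pages.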

Code Emission. After functions are assigned to bins, we create an
individual MCSection for each bin. These MCSections will map to
independent segments during linking. The AsmPrinter is responsible for
emitting the POT entries during code emission. We cannot easily
represent the POT as a standard IR object because it needs to contain
bin (MCSection) addresses. The AsmPrinter instead can query the
MCContext for the list of bin symbols and emit these symbols directly
into a global POT array.
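The shape of the emitted POT can be illustrated by generating assembly-like text from a list of bin symbols. The symbol names, the `pot_table` label, and the use of `.word` (one 32-bit entry per bin, as on 32-bit ARM) are assumptions for this sketch, which also leaves out the GOT entry; it does not use or model any LLVM API.

```python
def emit_pot(bin_symbols, pot_symbol="pot_table"):
    """Emit a POT as assembly text: one address-sized entry per bin symbol."""
    lines = [f"{pot_symbol}:"]
    # Each entry references a bin's section symbol, so its final value is
    # only known after linking and loading.
    lines += [f"    .word {sym}" for sym in bin_symbols]
    return "\n".join(lines)


text = emit_pot(["bin_0", "bin_1"])
```

Because every entry is a symbol reference rather than a constant, the table cannot be expressed as plain IR data, which is the difficulty the paragraph above describes.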

Gold Plugin Interface. If using LTO to build the module, LLVM can
generate the complete POT for the module and instrument all references
that need to use the POT. However, we must still ensure that bin
sections are each placed into an independent segment so that the
dynamic loader can map each bin separately. The gold plugin interface
currently provides support to assign sections to unique output
segments. However, it does not yet provide plugins an opportunity to
call this interface for new, plugin-created input files. Gold requires
that the plugin provide the file handle of the input section to assign
a section to a unique segment. We will need to upstream a small patch
for gold that provides a new callback to the LTO plugin when gold
receives a new, plugin-generated input file. This would allow the
plugin to obtain the new file’s handle and map its sections to unique
segments.

The linker must mark pagerando bin segments in such a way
that the dynamic loader knows that it can randomize each bin segment
independently. We propose a new ELF segment flag PF_RAND_ADDR that can
communicate this for each compatible segment. The compiler and/or
linker must add this flag to compatible segments for the loader to
recognize and randomize the relevant segments.
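A loader-side check for the proposed flag might look like the sketch below. `PF_X`, `PF_W`, `PF_R`, and `PT_LOAD` have their standard ELF values, but the numeric value given to `PF_RAND_ADDR` is purely hypothetical (the RFC does not assign one), and the dictionaries stand in for parsed program headers.

```python
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # standard ELF segment permission flags
PT_LOAD = 1                       # standard ELF loadable-segment type
PF_RAND_ADDR = 0x00100000         # HYPOTHETICAL value, chosen only for this sketch


def randomizable_segments(program_headers):
    """Return the loadable segments the loader may place independently."""
    return [
        ph
        for ph in program_headers
        if ph["p_type"] == PT_LOAD and ph["p_flags"] & PF_RAND_ADDR
    ]


headers = [
    {"name": "legacy text", "p_type": PT_LOAD, "p_flags": PF_R | PF_X},
    {"name": "bin 0", "p_type": PT_LOAD, "p_flags": PF_R | PF_X | PF_RAND_ADDR},
    {"name": "data", "p_type": PT_LOAD, "p_flags": PF_R | PF_W},
]
bins = randomizable_segments(headers)
```

Segments without the flag keep their usual placement relative to the library base, so a loader that does not know the flag can still load the library in the conventional way.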

## Target-Specific Details

We will initially support pagerando for ARM and AArch64, so several
details are worth considering on those targets. On ARM, r9 is a
platform-specific register that can be used as the static base
register, a role that is similar in many ways to the pagerando POT register. When
not specified by the platform, r9 is a callee-saved general-purpose
register. Thus, using r9 as the POT register will be backwards
compatible when calling out of pagerando code into either legacy code
or a different module; the callee will preserve r9 for use after
returning to pagerando code. On AArch64, r18 is designated as a
platform-specific register; however, it is not specified as
callee-saved when not reserved by the target platform. Thus, to
interoperate with unmodified legacy AArch64 software, we would need to
save r18 in pagerando code before calling into any external code. When
using LTO, the compiler will see the entire module and therefore be
able to identify calls into external vs internal code. Without LTO, it
will likely be more efficient to use a callee-saved register to avoid
the need to save the POT register before each call. We will experiment
with both caller- and callee-saved registers to determine which is
more efficient.
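The AArch64 case, where the POT register is not callee-saved, can be modelled as a caller-side save and restore around external calls. The register file as a dictionary, the helper names, and the value `0x5000` are illustrative assumptions, not generated code.

```python
def call_external(regs, callee, pot_reg="r18"):
    """Call legacy code that may clobber the POT register, then restore it."""
    saved = regs[pot_reg]       # caller-side save before leaving pagerando code
    result = callee(regs)
    regs[pot_reg] = saved       # restore after returning to pagerando code
    return result


def legacy_function(regs):
    # Unmodified code is free to use r18 when the platform does not reserve it.
    regs["r18"] = 0xDEAD
    return "done"


regs = {"r18": 0x5000}
status = call_external(regs, legacy_function)
```

With a callee-saved register such as r9 on ARM, the save and restore in `call_external` would be unnecessary because the callee preserves the value itself.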

[1] M. Backes and S. Nürnberger. Oxymoron - making fine-grained memory
randomization practical by allowing code sharing. In USENIX Security
Symposium, 2014. https://www.usenix.org/node/184466

[2] S. Crane, A. Homescu, and P. Larsen. Code randomization: Haven’t
we solved this problem yet? In IEEE Cybersecurity Development
Conference (SecDev), 2016.
