[LLVMdev] IR extension proposal: bitset constants

Fri Jan 30 11:44:16 PST 2015

Hi Chris,

I wanted to start by giving an explanation of what I am trying to achieve
and how I am trying to achieve it.

I am working towards introducing into LLVM a security mechanism, Forward
Control Flow Integrity (CFI), that is designed to mitigate against
vulnerabilities that allow attacks based on corrupting vtable or function
pointers in memory in order to subvert a program's control flow.

More details are in a paper that I co-authored that was presented at USENIX
Security last year [1]. As mentioned in the paper, attackers are increasingly
relying on such techniques to subvert control flow. This is why I feel that it
is particularly important that compilers contain practical defenses against
such attacks.

One particular variant of the defense I am proposing, vtable verification,
was implemented in GCC and described in section 3 of the paper, however it
comes with a significant performance overhead, more than 8% on certain Chrome
browser-based benchmarks and up to around 20% on SPEC 2006 benchmarks (see
Figure 2). This is likely due to the fact that it searches lists of vtables
to determine if a given vtable is valid. This is a direct consequence of
its avoidance of techniques that depend on changing how the program is linked.

The implementation I am proposing to contribute to LLVM focuses on
performance. Building the needed data structures at link time allows us
to reduce the cost of checking the validity of a vtable pointer to a few
machine instructions and a single load from memory.

On Thu, Jan 29, 2015 at 11:04:41PM -0800, Chris Lattner wrote:
> I don’t think that adding an IR construct for this is the way to go.  You’re making IR complicated for everyone to serve a very narrow use case (one that admittedly I don’t really understand).
> Also, your patch is incomplete.  Presumably you have to scatter checks for “if (!GV->isBitSet())” throughout the optimizer, codegen and other things that touch globals.

I'll address these points simultaneously because I would like to explain
why extensive support for the new construct is not required, and why the
maintenance burden is not as large as it might seem (and why my patch does
not need to make such extensive changes to the optimizers).

The first point I'd like to make is that from the point of view of optimizer
passes other than bitset lowering, the llvm.bitset.test intrinsic has opaque
semantics with regard to the content of the globals it references, so they
cannot legally modify the contents of the bitset global.

The second is that any use of a bitset global other than as an argument to
the llvm.bitset.test intrinsic has undefined semantics. (This is something
that can be documented better.) This means that any optimizer pass that
looks through global initializers does not require any changes, as any
transformation it may perform on IR that treats such globals as regular
globals (for example by taking its address or loading from it) is semantics
preserving, as the semantics of such IR would have been undefined anyway.

With these points in mind, I'm reasonably confident that very little code
needs to care about the new flag.

(I should also point out that I know that the patch most likely works without
any other optimizer changes, because I have a work-in-progress patch that
implements the Clang side of this, and have successfully applied it to a
large C++ codebase and found real bugs in that codebase.)

Regarding codegen, I haven't implemented support in codegen for bitsets yet,
the intrinsic is completely handled by the pass. I can't imagine the changes
being very intrusive though. We can easily add a check that no bitset stuff
makes it through to codegen for the moment.

> The fact that this affects semantics and will only work with LTO and not native linkers is also really weird to me.

I agree, which is why I plan to add support to lld and perhaps other linkers,
but we do have to start somewhere. Adding the functionality to the compiler
only seems like a reasonable first step, even if we depend on LTO to begin
with.

> Is there other precedent for that?  The only cases I know that affect LTO add information that is safe to drop (e.g. TBAA etc).

There is the jumptable attribute, which has been used to implement a variant
of CFI for indirect function calls (see section 4 of the USENIX paper), and
that only works effectively with an LTO'd module. (We might end up adding
native linker support for that or something similar as well.)

Thanks,
-- 
Peter

[1] http://www.pcc.me.uk/~peter/acad/usenix14.pdf