[LLVMdev] [RFC] Simple control-flow integrity

Mon Feb 10 17:19:33 PST 2014

>     1. creates a power-of-two sized InlineAsm jump table (or multiple
> jump tables) filled with jump instructions to each address-taken
> function.
>

Why inline asm? There's probably a better way to do this, e.g. by
lowering your jump table in the backend.

-eric

>     2. replaces each use of such an address-taken function with a
> pointer to the corresponding location in the appropriate table. Note
> that these will be valid function pointers for the purposes of
> external code.
>
>     3. adds a fast check for pointer safety at each indirect call site:
>
>          a. It forces the pointer into the appropriate table (based on
> type information), and checks to see if the pointer changed. Pointers
> that were already in the right table will not change, and all other
> pointers will. If we can guarantee sufficient table alignment, we
> rewrite the pointer by masking it and adding the result to a base
> pointer; otherwise, we subtract the base, mask, and then add the
> base back.
>
>          b. If the pointer fails the check, it's passed to a CFI
> failure function defined at compile time to handle it. By default, we
> define a function written in IR; this function prints out the name of
> the function in which the CFI violation happens.
>
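The check in step 3a can be sketched in C roughly as follows. This is
only an illustration under stated assumptions, not the pass's actual
output: addresses are modelled as plain integers, the table is assumed
to hold 8 entries of 8 bytes each, and the names cfi_force_aligned,
cfi_force_unaligned and cfi_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not from the RFC): 8 jump entries of 8 bytes each. */
#define CFI_ENTRY_BYTES 8u
#define CFI_TABLE_BYTES 64u /* must be a power of two */

/* Keeps the bits that select an entry; clears the bits inside an entry. */
#define CFI_MASK \
    ((uintptr_t)(CFI_TABLE_BYTES - 1u) & ~(uintptr_t)(CFI_ENTRY_BYTES - 1u))

/* Variant for a table whose base is aligned to the table size:
 * mask the pointer, then add the base. */
uintptr_t cfi_force_aligned(uintptr_t ptr, uintptr_t base)
{
    assert((base & (uintptr_t)(CFI_TABLE_BYTES - 1u)) == 0);
    return (ptr & CFI_MASK) + base;
}

/* Variant for an arbitrarily placed table:
 * subtract the base, mask, then add the base back. */
uintptr_t cfi_force_unaligned(uintptr_t ptr, uintptr_t base)
{
    return ((ptr - base) & CFI_MASK) + base;
}

/* Returns 1 if ptr was already a valid table entry (unchanged by the
 * rewrite), 0 otherwise. A real pass would call the CFI failure
 * function in the 0 case. */
int cfi_check(uintptr_t ptr, uintptr_t base)
{
    return cfi_force_unaligned(ptr, base) == ptr;
}
```

A pointer that already names a table entry, such as base + 16, comes
back unchanged. Any other value is mapped onto some entry of the table
and therefore differs from the input, which is what the comparison
detects.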
>
>
>
> The biggest challenge for such an implementation is functions that are
> neither declared nor defined at LTO time. Pointers to these functions
> are not in any table, so they cause false positives in the CFI check.
> They can occur in at least 3 ways:
>
>     - JIT code, like in the V8 JavaScript engine, can allocate and
> call functions that were not defined at compile time. These functions
> are not even external: they just didn't exist at LTO time.
>
>     - External functions can return pointers to external functions
> that were not exposed at LTO time. The canonical example in this class
> is dlsym, which is used extensively by many projects. Other commonly
> used cases are signal/sigaction (which hand back the old signal
> handler), XSetErrorHandler from X, and std::set_new_handler from the
> C++ Standard Library. But this happens with any dynamically linked
> library that has a function that returns function pointers.
>
>     - Internal code that takes function pointer arguments can be
> passed to external code and have external function pointers passed to
> it as arguments. This pattern is used extensively by graphics
> libraries, e.g., GTK.
>
>
>
>
> I have some techniques that help handle these false positives:
>
>     - Since CFI violations are passed to an arbitrary function, the
> policy for these violations can be set at compile time. For example,
> you could run the rewritten code for a while to build up a set of
> known false positives, then switch to a CFI failure function that
> stops when it sees something not allowed by the policy. This is
> similar to the approach taken by, e.g., AppArmor.
>
>     - My current CFI pass looks for special annotations added to the
> source code: these are of the form
> __attribute__((annotate("cfi-maybe-external"))) and
> __attribute__((annotate("cfi-no-rewrite")))
>
>          - cfi-maybe-external can be applied to pointers and variables
> (llvm.ptr.annotation and llvm.var.annotation) and means that this
> value sometimes stores external function pointers.
>
>          - cfi-no-rewrite is applied to functions and means that there
> are indirect calls in this function that can happen with external
> function pointers. The current implementation skips rewriting for
> these functions, but it could instead be used to prepopulate a list of
> known potential false positives.
>
>     - I have a separate analysis pass called ExternalFunctionAnalysis
> that does a fairly naive interprocedural dataflow analysis starting
> from cfi-maybe-external annotations and from all places where it can
> find external function pointers coming in to the module:
>
>           - if an external function pointer flows into a store that
> doesn't flow from an annotated location, then the pass prints a
> warning
>
>           - all indirect call sites that flow from annotated
> pointers/variables are not rewritten (but this could instead be used
> to prepopulate a whitelist of known false positives).
>
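The learn-then-enforce policy from the first bullet could look roughly
like the following. This is a hypothetical sketch: the RFC does not
give the failure function's signature, so cfi_failure, cfi_set_mode,
the two modes, and the fixed-size list of known targets are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy modes: record violations, or reject unknown ones. */
enum cfi_mode { CFI_LEARN, CFI_ENFORCE };

#define CFI_MAX_KNOWN 64u

static enum cfi_mode cfi_mode_now = CFI_LEARN;
static uintptr_t cfi_known[CFI_MAX_KNOWN]; /* targets seen while learning */
static size_t cfi_known_count;

void cfi_set_mode(enum cfi_mode mode) { cfi_mode_now = mode; }

static int cfi_is_known(uintptr_t target)
{
    assert(cfi_known_count <= CFI_MAX_KNOWN);
    for (size_t i = 0; i < cfi_known_count; i++)
        if (cfi_known[i] == target)
            return 1;
    return 0;
}

/* Called when an indirect-call target fails the table check.
 * Returns 1 if the call may proceed, 0 if it must be stopped.
 * A real handler would more likely abort instead of returning 0. */
int cfi_failure(uintptr_t target)
{
    if (cfi_is_known(target))
        return 1; /* previously recorded false positive */
    if (cfi_mode_now == CFI_LEARN) {
        if (cfi_known_count < CFI_MAX_KNOWN)
            cfi_known[cfi_known_count++] = target;
        return 1; /* learning: record the target and allow the call */
    }
    return 0; /* enforcing: target is not allowed by the policy */
}
```

Running in CFI_LEARN for a while builds up the list of known targets;
after cfi_set_mode(CFI_ENFORCE), only those targets pass.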
>
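For reference, the two annotations might be placed in source code as
in the sketch below. Only the annotation strings come from the RFC;
the struct, the function names, and the exact placement on a field and
on a local variable are assumptions. Compilers other than Clang will
typically ignore the attribute with a warning.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*callback_fn)(int);

struct widget {
    /* Annotated field: with Clang, accesses are expected to go through
     * llvm.ptr.annotation. Marks a slot that may hold external pointers. */
    callback_fn on_click __attribute__((annotate("cfi-maybe-external")));
};

/* Annotated function: its indirect calls may use external function
 * pointers, so the pass described above would skip rewriting it. */
__attribute__((annotate("cfi-no-rewrite")))
int widget_click(const struct widget *w, int arg)
{
    assert(w != NULL);
    return w->on_click ? w->on_click(arg) : -1;
}

int widget_click_local(callback_fn cb, int arg)
{
    /* Annotated local: with Clang, this is expected to produce a call
     * to llvm.var.annotation for the variable. */
    callback_fn target __attribute__((annotate("cfi-maybe-external"))) = cb;
    return target(arg);
}

/* Stand-in for a callback; in practice it might come from dlsym. */
int double_it(int x) { return 2 * x; }
```

The annotations do not change what the code does; they only mark the
values and functions that the CFI pass should treat specially.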
>
> As I mentioned, I've used my current implementation to build a version
> of Chromium protected with this form of CFI; in the process, I added
> sufficient annotations to the Chromium code base to catch all false
> positives (or at least: I haven't seen any in my testing so far). I've
> also tried it out with other, less immense, projects, like the SPEC
> CPU2006 benchmark suite.
>
> Please let me know what you think.
>
> Thanks,
>
> Tom