<div dir="auto"><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Jan 26, 2018 10:10 AM, "Paul Rouschal" <<a href="mailto:prouschal@a-bix.com" target="_blank">prouschal@a-bix.com</a>> wrote:<br type="attribution"><blockquote class="m_1539297806344428808quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<div class="m_1539297806344428808quoted-text"><br>

<br>

Sean Silva via llvm-dev wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Wouldn't a branch funnel open the door to a type 1 attack?<br>

</blockquote>

<br>

<br></div>

Only if the code looks exactly as you wrote it. If I understand this correctly the problem with indirect branches is that the "gadget", the code leaking the data, could be *anywhere* in the binary, giving the attacker much more freedom. So restricting these calls to one of the known correct results will still be a (relative) win.<br></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">You're right, it does mitigate the ROP aspect.</div><div dir="auto"><br></div><div dir="auto">-- Sean Silva</div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="m_1539297806344428808quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

<br>

Hand-Wavy Example:<div class="m_1539297806344428808quoted-text"><br>

<br>

struct Base {<br>

      virtual int f(long) = 0;<br>

};<br>

<br>

struct A : Base {<br>

     int f(long x) override {<br>

          return 0;<br>

     };<br>

};<br>

<br>

struct B : Base {<br>

     int f(long x) override {<br></div>

          return 1;<br>

     };<br>

};<br>

<br>

static int aCompletelyUnrelatedFunction() {<br>

    someOtherCode();<br>

Gadget:<br>

    int z = array2[array1[somethingInTheSameRegisterAsX] * 256];<br>

    return z;<br>

}<br>

<br>

Here the attacker could train the predictor to continue execution at "Gadget".<br>

<br>

To quote from [1]<br>

<br>

"To mistrain the BTB, the attacker finds the virtual address of the gadget in the victim’s address space, then performs indirect branches to this address. This training is done from the attacker’s address space, and it does not matter what resides at the gadget address in the attacker’s address space; all that is required is that the branch used for training branches to use the same destination virtual address."<br>

<br>

[1] Kocher et al.: Spectre Attacks: Exploiting Speculative Execution <a href="https://spectreattack.com/spectre.pdf" rel="noreferrer" target="_blank">https://spectreattack.com/spectre.pdf</a><br>

<br>

<br>

Best Regards,<br>

Paul<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="m_1539297806344428808quoted-text">

E.g. if the code looks like this, then a branch funnel basically turns into a standard type 1 pattern AFAICT:<br>

<br>

struct Base {<br>

     virtual int f(long) = 0;<br>

};<br>

<br>

struct A : Base {<br>

     int f(long x) override {<br>

         return 0;<br>

     };<br>

};<br>

<br>

struct B : Base {<br>

     int f(long x) override {<br>

         // As in listing 1 in <a href="https://spectreattack.com/spectre.pdf" rel="noreferrer" target="_blank">https://spectreattack.com/spec<wbr>tre.pdf</a><br>

         return array2[array1[x] * 256];<br>

     }<br>

};<br>

<br>

-- Sean Silva<br>

<br></div><div class="m_1539297806344428808quoted-text">

On Tue, Jan 23, 2018 at 4:44 PM, Peter Collingbourne via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br>

<br>

    The proposed mitigation for Spectre variant 2 (CVE-2017-5715, “branch<br>

    target injection”) is to send all indirect branches through an<br>

    instruction sequence known as a retpoline. Because the purpose of a<br>

    retpoline is to prevent attacker-controlled speculation, we also end<br>

    up losing the benefits of benign speculation, which can lead to a<br>

    measurable loss of performance.<br>

<br>

    We can regain some of those benefits if we know that the set of<br>

    possible branch targets is fixed (this is sometimes known to be the<br>

    case when using whole-program devirtualization or CFI -- see<br>

    <a href="https://clang.llvm.org/docs/LTOVisibility.html" rel="noreferrer" target="_blank">https://clang.llvm.org/docs/LTOVisibility.html</a>).<br></div>

    In that case, we<div class="m_1539297806344428808elided-text"><br>

    can construct a so-called “branch funnel” that selects one of the<br>

    possible targets by performing a binary search on an address<br>

    associated with the indirect branch (for virtual calls, this is the<br>

    address of the vtable, and for indirect calls via a function<br>

    pointer, this is the function pointer itself), eventually directly<br>

    branching to the selected target. As a result, the processor is free<br>

    to speculatively execute the virtual call, but it can only<br>

    speculatively branch to addresses of valid implementations of the<br>

    virtual function, as opposed to arbitrary addresses.<br>

<br>
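    A rough C++ model of the selection step described above, assuming the candidate addresses are kept in a sorted table. The name select_target and the use of plain integers for addresses are illustrative assumptions, not part of the proposal; a real funnel would presumably be emitted as a fixed tree of compares and direct branches rather than a loop over data.<br>
<pre>
```cpp
// Hypothetical model of a branch funnel's selection logic.
// `sorted` holds the n known addresses in ascending order (n >= 1).
// Returns the index of the chosen target; the result is always in [0, n),
// so even an unexpected address can only select one of the known targets.
int select_target(const unsigned long *sorted, int n, unsigned long key) {
    int lo = 0;
    int hi = n - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (key <= sorted[mid]) {
            hi = mid;      // answer lies in [lo, mid]
        } else {
            lo = mid + 1;  // answer lies in (mid, hi]
        }
    }
    return lo;
}
```
</pre>
    With ten targets this takes at most four comparisons. The clamping of unmatched addresses to a valid index is a property of this sketch; whether the actual lowering behaves the same way for larger target sets is an assumption.<br>

<br>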

    For example, suppose that we have the following class hierarchy,<br>

    which is known to be closed:<br>

<br>

    struct Base { virtual void f() = 0; };<br>

    struct A : Base { virtual void f(); };<br>

    struct B : Base { virtual void f(); };<br>

    struct C : Base { virtual void f(); };<br>

<br>

    We can lay out the vtables for the derived classes in the order A,<br>

    B, C, and produce an instruction sequence that directs execution to<br>

    one of the targets A::f, B::f and C::f depending on the vtable<br>

    address. In x86_64 assembly, a branch funnel would look like this:<br>

<br>

    lea B::vtable+16(%rip), %r11<br>

    cmp %r11, %r10<br>

    jb A::f<br>

    je B::f<br>

    jmp C::f<br>

<br>

    A caller performs a virtual call by loading the vtable address into<br>

    register r10, setting up the other registers for the virtual call<br>

    and directly calling the branch funnel as if it were a regular<br>

    function. Because the branch funnel enforces control flow integrity<br>

    by itself, we can also avoid emitting CFI checks at call sites that<br>

    use branch funnels when CFI is enabled.<br>

<br>
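    For readers who prefer C++ to assembly, the three-target funnel above can be mirrored roughly as follows. This is only a sketch: the VTable struct, the array used to model the controlled A, B, C layout, and the int return values (the original f() returns void) are all assumptions made so the example is self-contained and checkable.<br>
<pre>
```cpp
// Hypothetical stand-in for a vtable; only its address matters here.
struct VTable { int id; };

// Assumption: the compiler lays the vtables out adjacently in the order A, B, C.
// An array models that guarantee and keeps the pointer comparisons well defined.
static const VTable vtables[3] = {{0}, {1}, {2}};

// Stand-ins for A::f, B::f and C::f; the return value just tags the target.
static int A_f() { return 0; }
static int B_f() { return 1; }
static int C_f() { return 2; }

// Mirrors: lea B::vtable, cmp, jb A::f, je B::f, jmp C::f.
// Every transfer is a direct call, so no indirect branch is involved.
int branch_funnel(const VTable *vt) {
    const VTable *pivot = vtables + 1;  // address of B's vtable
    if (vt < pivot) return A_f();
    if (vt == pivot) return B_f();
    return C_f();
}
```
</pre>
    A caller would pass the object's vtable address where the assembly version expects it in r10.<br>

<br>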

    To control the layout of vtables and function pointers, we can<br>

    extend existing mechanisms for controlling layout that are used to<br>

    implement CFI (see<br>

    <a href="https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html" rel="noreferrer" target="_blank">https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html</a>)<br></div>

    so<div class="m_1539297806344428808elided-text"><br>

    that they are also used whenever a branch funnel needs to be created.<br>

<br>

    The compiler will only use branch funnels when both the retpoline<br>

    mitigation (-mretpoline) and whole-program devirtualization<br>

    (-fwhole-program-vtables) features are enabled (the former is on the<br>

    assumption that in general a regular indirect call will be less<br>

    expensive than a branch funnel, and the latter provides the<br>

    necessary guarantee that the type hierarchy is closed). Even when<br>

    retpolines are enabled, there is still a cost associated with<br>

    executing a branch funnel that needs to be balanced against the cost<br>

    of a regular CFI check and retpoline, so branch funnels are only<br>

    used when there are <=10 targets (this number has not been tuned<br>

    yet). Because the implementation uses some of the same mechanisms<br>

    that are used to implement CFI and whole-program devirtualization,<br>

    it requires LTO (it is compatible with both full LTO and ThinLTO).<br>

<br>

    To measure the performance impact of branch funnels, I ran a<br>

    selection of Chrome benchmark suites on Chrome binaries built with<br>

    CFI, CFI + retpoline and CFI + retpoline + branch funnels, and<br>

    measured the median impact over all benchmarks in each suite. The<br>

    numbers are presented below. I should preface these numbers by<br>

    saying that these are largely microbenchmarks, so the impact of<br>

    retpoline on its own is unlikely to be characteristic of real<br>

    workloads. The numbers to focus on are the impact of retpoline +<br>

    branch funnels relative to the impact of retpoline alone: a median<br>

    5.7% regression, compared to the median 8% regression associated<br>

    with retpoline alone.<br>

<br>

    Benchmark suite | CFI + retpoline impact (relative to CFI) | CFI + retpoline + BF impact (relative to CFI)<br>

    blink_perf.bindings | 0.9% improvement | 9.8% improvement<br>

    blink_perf.dom | 20.4% regression | 17.5% regression<br>

    blink_perf.layout | 17.4% regression | 14.3% regression<br>

    blink_perf.parser | 3.8% regression | 5.7% regression<br>

    blink_perf.svg | 8.0% regression | 5.4% regression<br>

<br>
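    As a quick cross-check of the quoted medians, the per-suite figures above can be fed through a small median routine. This is a sketch under one assumption that is not stated in the original: an improvement is entered as a negative regression. The names median5, retpoline and retpoline_bf are illustrative.<br>
<pre>
```cpp
// Per-suite impact relative to CFI, in percent regression.
// Assumption: an improvement is entered as a negative regression.
static const double retpoline[5] = {-0.9, 20.4, 17.4, 3.8, 8.0};
static const double retpoline_bf[5] = {-9.8, 17.5, 14.3, 5.7, 5.4};

// Median of exactly five values: insertion-sort a copy, take the middle.
double median5(const double in[5]) {
    double v[5];
    for (int i = 0; i < 5; ++i) v[i] = in[i];
    for (int i = 1; i < 5; ++i) {
        double key = v[i];
        int j = i - 1;
        while (j >= 0) {
            if (v[j] <= key) break;  // found the insertion point
            v[j + 1] = v[j];
            --j;
        }
        v[j + 1] = key;
    }
    return v[2];
}
```
</pre>
    Under that sign convention, median5(retpoline) gives 8.0 and median5(retpoline_bf) gives 5.7, which agrees with the medians quoted in the text.<br>

<br>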

<br>

      Future work<br>

<br>

    Implementation of branch funnels for architectures other than x86_64.<br>

<br>

    Implementation of branch funnels for indirect calls via a function<br>

    pointer (currently only implemented for virtual calls). This will<br>

    probably require an implementation of whole-program<br>

    “devirtualization” for indirect calls.<br>

<br>

    Use profile data to order the comparisons in the branch funnel by<br>

    frequency, to minimise the number of comparisons required for<br>

    frequent virtual calls.<br>

<br>

    Thanks,<br>

    -- Peter<br>

<br>

    ______________________________<wbr>_________________<br>

    LLVM Developers mailing list<br></div>

    <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

    <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><div class="m_1539297806344428808quoted-text"><br>

<br>

<br>

<br>

<br>


</div></blockquote>

</blockquote></div><br></div></div></div>