[PATCH] D41761: Introduce llvm.nospeculateload intrinsic

Tue Jan 30 08:25:47 PST 2018

kristof.beyls added a comment.

In https://reviews.llvm.org/D41761#989799, @efriedma wrote:

> > I have no idea what form that assurance would take, since I don't know how LangRef handles such matters.
>
> Well, I don't really know either; LangRef only describes the abstract virtual machine, mostly.  That's part of the problem. :)
>
> > How can two variant-1 attacks be "different" enough that a speculationsafeload would protect against one but not the other, when both exploit the same load operation?
>
> Sorry, wasn't quite clear.  There are two speculated loads for a variant-1 attack: the load that reads the secret, and the load that leaks the secret to the user.  The first load has to be speculation-safe to stop the attack; whether the second is speculation-safe is irrelevant, at least in the proposals so far.  That isn't really a fundamental problem, just an illustration that reasoning about these attacks is tricky.
>
> > I don't see how the code being spread over multiple functions matters- all that matters are the load, and the branch (or nested branches) that actually guard that load
>
> Well, the CPU doesn't care (assuming it can perfectly predict calls), but it's problematic from an auditing perspective because it's harder to spot.  Particularly since the vulnerable code might not explicitly reference any user-controlled data at all.
>
> > If the application logic doesn't explicitly prevent any of the loads the attacker is exploiting
>
> A C programmer cannot reasonably come up with a complete list of all the potentially exploitable loads without the compiler being aware that the user needs Spectre-resistant code.  There are two key problems:
>
> 1. An exploitable load might not exist in the original program.  One example is the switch-to-lookup-table transform.  Given:
>
>   ```
>   int a(unsigned x) {
>     switch (x) {
>     case 0: return 2;
>     case 1: return 44;
>     case 2: return 23;
>     default: return 8;
>     }
>   }
>   ```
>
>   We transform to:
>
>   ```
>   int a(unsigned x) {
>     if (x > 2) return 8;
>     const static int table[] = {2, 44, 23};
>     return table[x];
>   }
>   ```
>
>   Now you have a speculated out-of-bounds load from code which didn't contain any loads.
> 2. The exploitable code might come out of some non-obvious lowering.  Even if a pointer points to something "obviously" safe, it might actually be uninitialized along some impossible path through the function.  Probably the easiest example to explain is polly's invariant load hoisting.  Basically, if you have a loop like this:
>
>   ```
>   int sum = 0;
>   for (int i = 0; i < n; ++i) {
>     if (b) sum += (*p)[i];
>   }
>   ```
>
>   It gets transformed to something like this:
>
>   ```
>   int *pp;
>   if (b) pp = *p;
>   int sum = 0;
>   for (int i = 0; i < n; ++i) {
>     if (b) sum += pp[i];
>   }
>   ```
>
>   So now you have the if() if() pattern which leads to a speculated load from an uninitialized pointer.

Thanks for the clear examples, Eli.
I wonder whether anyone on this thread has already thought about whether it would be practically possible to make those transformations not introduce such a pattern, e.g. under a specific code generation option. Or, if a transformation does introduce such a pattern, whether it would be feasible for it to make use of whatever intrinsic we end up with.
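
To make the second question a bit more concrete, here is a rough sketch in C of what a hardened form of the switch-to-lookup-table output might look like. The helper `clamp_index_nospec` and the function name `a_table` are hypothetical names made up for this illustration; the helper is not the intrinsic proposed in this patch, only a stand-in that masks the index so that an out-of-range value cannot be used to form the address, even when the bounds check is mispredicted. Note that a compiler is free to turn the comparison inside the helper back into a branch, which is part of why an intrinsic with defined semantics is being discussed at all.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not the proposed
 * intrinsic): returns idx when idx < size, and 0 otherwise, using a mask
 * rather than a conditional return. */
static inline size_t clamp_index_nospec(size_t idx, size_t size) {
    /* mask is all ones when idx < size, all zeros otherwise */
    size_t mask = (size_t)0 - (size_t)(idx < size);
    return idx & mask;
}

/* Lookup-table form of the switch example, with the index masked before
 * it is used to address the table. */
int a_table(unsigned x) {
    static const int table[] = {2, 44, 23};
    if (x > 2)
        return 8;
    /* Architecturally x <= 2 here; the mask only matters if this load is
     * executed speculatively with an out-of-range x. */
    return table[clamp_index_nospec(x, sizeof table / sizeof table[0])];
}
```

Architecturally this returns the same values as the original switch, e.g. `a_table(1)` yields 44 and `a_table(7)` yields 8; the masking is only meant to change which address a speculated load can touch.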

https://reviews.llvm.org/D41761