[PATCH] D76158: Add inline assembly load hardening mitigation for Load Value Injection (LVI) on X86 [6/6]

Tue Apr 7 03:12:50 PDT 2020

mattdr added inline comments.

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3167
+  switch (Inst.getOpcode()) {
+  case X86::RET:
+  case X86::RETL:
----------------
craig.topper wrote:
> sconstab wrote:
> > mattdr wrote:
> > > Following up on my previous comment: it seems like this is supposed to be the full set of instructions that combine a load from memory with a control flow change dependent on that load.
> > > 
> > > How are we sure the list is complete? Just from a quick look at the [[ https://github.com/llvm/llvm-project/blob/1c545f6dbcbb3ada2dfef2c6afbc1ca8939135cb/llvm/lib/Target/X86/X86InstrControl.td#L23-L57 | return instructions LLVM has for X86 ]], it looks like `LRETL` and `LRETQ` exist and would not be mitigated.
> > > 
> > > Can we just look at the `MCInstDesc` and warn for instructions that `mayLoad` and `mayAffectControlFlow`? Then this list will stay up to date even if new instructions are added.
> > > 
> > I looked into this and unfortunately the mayLoad attribute is not set for any of the RET family (see https://github.com/llvm/llvm-project/blob/b1d581019f5d4d176ad70ddffee13db247a13ef1/llvm/lib/Target/X86/X86InstrControl.td#L21). Either this is a bug or I don't actually understand what the mayLoad attribute actually means. It seems pretty strange since the POPr family has mayLoad. @craig.topper : opinion?
> I suspect it doesn't have the mayLoad flag because from the perspective of the codegen passes that normally use that flag, the load of the return address isn't really something we would care about. It could just as well be reading the return address from some stack dedicated return address stack or a stack implemented in hardware or something.
> 
> Not sure what the implications of just adding the flag would be to codegen.
Yikes, okay. It's incredibly surprising that `mayLoad` isn't set for the `RET` family, and if anything stops working when we set it that's almost certainly a bug we want to know about.

But I can accept that fixing that is out of scope for this change -- at least, for the initial version. If so, though, that makes it all the more important that we describe how we arrived at this particular list of instructions, in enough detail that someone else can retrace our steps.

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3186
+  case X86::FARCALL64:
+    Warning(Inst.getLoc(), "Instruction may be vulnerable to LVI and "
+                           "requires manual mitigation");
----------------
sconstab wrote:
> mattdr wrote:
> > This warning should at least tell users *why* this is vulnerable, and ideally offer the next step for fixing it.
> The manual mitigation is a little bit more involved than what would typically be conveyed in a compiler warning. Would it be acceptable for the warning to point to the LVI deep dive document?
Sure, a link to an external site seems fine. Can we please link to the specific part of the page that explains the mitigations for this specific set of instructions?

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3204
+  auto Flags = Inst.getFlags();
+  if ((Flags & X86::REP_PREFIX) || (Flags & X86::REPNE_PREFIX)) {
+    switch (Opcode) {
----------------
craig.topper wrote:
> I don't think this works if the rep prefix is on the line before the cmps/scas instruction. That will assemble to the same thing but the parser sees it as two instructions.
@craig.topper any notion what the best workaround would be? One option is keeping track of whether we've seen a `REP` or `REPNE` before this.

Although perhaps a simpler approach is just not worrying about whether we saw `REP`. We're here to try to warn users about instructions that we can't mitigate. What's the likelihood that we see a `CMPSB` without `REP` and it turns out it _doesn't_ need mitigation?

================
Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3206
+    switch (Opcode) {
+    case X86::CMPSB:
+    case X86::CMPSW:
----------------
mattdr wrote:
> I didn't find any mention of these instructions, why they would "require additional considerations", or what folks should actually *do* if they hit this warning by reading https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection.
> 
> Here's my best guess:
> This is a list of [[ https://cpu.fyi/d/484#G7.340223 | string instructions that can be used with `REP` ]] and which update `ZF` in `EFLAGS` instead of relying on an a count in `rcx` to terminate". When used with `REP` or `REPNE`, these instructions become interesting because an attacker could use LVI to change the target's speculative control flow by injecting a value for a load. Since the instruction is indivisible, there's no place to put an `LFENCE`, so like `RET` these instructions need to be split into load and compare/branch components.
> 
> Is that accurate?
> 
> If so, we could explain it that way and replace the hardcoded list with `hasImplicitDefOfPhysReg(X86::EFLAGS)`
And if we hit the same problem as above with `RET`, where somehow the instruction tables are inaccurate and we can't rely on them yet, then it's similarly okay with me if we leave the hardcoded list of instructions as long as we provide really specific comments describing why the list is all and only these instructions.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76158/new/

https://reviews.llvm.org/D76158