[PATCH] D119296: KCFI sanitizer

Thu Apr 21 16:37:26 PDT 2022

joaomoreira added a comment.

I played a little bit with kcfi and here are some thoughts:

- under -Os I saw functions being inlined, regardless of the source code calling them indirectly. In these scenarios, the KCFI check was still in place, even though there was not a pointer involved in the call. Although not semantically incorrect, it would be great to prevent the unnecessary overhead (see attached source/compile it with -Os and -fsanitize=kcfi).

F22842586: example.c <https://reviews.llvm.org/F22842586>.

- I noticed that KCFI checks are placed much before the indirect call arguments are properly placed in the passing registers. This makes the position of the check unpredictable. I assume this is a bad thing in case the kernel eventually decides to come up with an approach that uses alternatives CFI schemes through text-patching.
- On the same line as the above comment, and on a complete hypothetical thought, this kind of CFI approach is subject to "Time-of-Check/Time-of-Use" attacks, in the sense that if you have a longer sequence of instructions between the check and the call, the easier it is for an attacker to win the race condition and you are more susceptible to the attack. This shouldn't be a huge concern considering that the function pointer is inside a register, but I think it is reasonable to keep in mind that register contents may go into memory in the occasion of interruptions and such. This is really me being picky (so feel free to disregard), but I think it counts as a reason to keep checks and calls as close as possible.

Because of things like the above, in the past I decided to implement these things in the very backend of the compiler, so other optimizations would not break the layout nor leave dummy checks around. I find it nice to have this implemented as a more architecture general feature, but maybe it would be cool to have a finalization pass in the X86 backend just to tie things. Or maybe you have a better approach on how to prevent these.

================
Comment at: clang/lib/CodeGen/CGExpr.cpp:3168
+      -1);
+  llvm::Value *Test = Builder.CreateICmpEQ(Builder.CreateLoad(HashPtr), Hash);
+  llvm::BasicBlock *ContBB = createBasicBlock("kcfi.cont");
----------------
pcc wrote:
> pcc wrote:
> > samitolvanen wrote:
> > > pcc wrote:
> > > > We considered a scheme like this before and one problem that we discovered with comparing the hash in this way is that it can produce gadgets, e.g.
> > > > ```
> > > > movabs $0x0123456789abcdef, %rax
> > > > cmp %rax, ...
> > > > ```
> > > > the `cmp`instruction ends up being a valid target address because the `movabs` instruction ends in the hash. The way we thought about solving this was to introduce a new intrinsic that would materialize the constant without these gadgets (e.g. invert the `movabs` operand and follow it by a `not`).
> > > Yes, that's a concern with this approach, at least on x86_64. As the hash is more or less random, I assume you'd have to actually check that the inverted form won't have useful gadgets either, and potentially split the single `movabs` into multiple instructions if needed etc. Did you ever start work on the intrinsic or was that just an idea?
> > The likelihood of the inverted operand having gadgets seems equal to that of any other piece of code having gadgets (here I'm just talking about KCFI gadgets, not any other kind of gadget). And if you're using a fixed 2-byte prefix it would be impossible for the `movabs` operand to itself be a gadget. So I don't think it would be necessary to check the inverted operand specifically for gadgets.
> > 
> > You might want to consider selecting the fixed prefix more carefully. It may be worth looking for a prefix that is less likely to appear in generated code (e.g. by taking a histogram of 2-byte sequences in a corpus of libraries) rather than choosing one arbitrarily.
> Also the intrinsic was just an idea, we never implemented it because we ended up going with the currently implemented strategy for the CFI sanitizers.
Please, consider double checking that ENDBR instructions are not accidentally generated as tags.

Also, assuming that there will exist kcfi_unchecked calls in the source code, it would be great to ensure that critical instructions like clac are also not accidentally generated as tags.

================
Comment at: llvm/lib/Target/X86/X86AsmPrinter.cpp:114
+  // Emit two zero bytes to avoid gadgets in llvm.kcfi.check().
+  OutStreamer->emitZeros(2);
+}
----------------
I have seen speculative mitigations using instructions that stop the control-flow (like int3 or hlt) as space fillers just as a redundant/security-in-depth scheme to hold any speculative flow that may have been launched from somewhere else. I think these zeros here can be replaced by 0xcc without harm, right?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296