[PATCH] D119296: KCFI sanitizer

Thu Jun 2 14:41:39 PDT 2022

joaomoreira added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.h:83
+    KCFI_NT_CALL,
+    KCFI_TC_RETURN,
+
----------------
samitolvanen wrote:
> joaomoreira wrote:
> > I did not revise the entire patch yet. With this said, IMHO, this looks like an overcomplication of a simple problem. Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction? Then, if hash == 0, it just means it is a call that doesn't need instrumentation.
> > 
> > This latter approach will require less code and should be easier to maintain compatible with other CFI approaches. If the reason is because you don't want to have a useless attribute for non-call instructions, then you could possibly have a map where you bind the call instruction with a respective hash.
> > 
> > Unless there is a strong reason for these, I would much better prefer the slim approach suggested. Either way, if there is a reason for this, I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction?
> 
> This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?
> 
> > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> 
> Always happy to hear suggestions for alternative naming. Did you have something in mind?
> This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?

My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.

Imagine you have some mitigation like [[ https://www.cs.columbia.edu/~vpk/research/kguard/ | kguard  ]] being implemented. Now you can have calls which are KCFI checked but not KGUARD checked; then KCFI not-checked but KGUARD checked; then KCFI and KGUARD checked.; then none-checked. And then you need all these variations for tail calls (which imho is a first, minor, instance of the problem)...

So, in general, my understanding is that this approach works, yeah, but that in the long term it could become a hassle... so ideally we should use attributes to define these sub-specific instructions instead of opcodes.

> 
> > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> 
> Always happy to hear suggestions for alternative naming. Did you have something in mind?

I think switching from KCFI into CFI would already be good enough, as in the end these are all implementing the [[ https://dl.acm.org/doi/10.1145/1102120.1102165 | control-flow integrity ]] concept.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296