[PATCH] D119296: KCFI sanitizer

Sami Tolvanen via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jun 2 15:59:10 PDT 2022


samitolvanen added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.h:83
+    KCFI_NT_CALL,
+    KCFI_TC_RETURN,
+
----------------
joaomoreira wrote:
> samitolvanen wrote:
> > joaomoreira wrote:
> > > I did not revise the entire patch yet. With this said, IMHO, this looks like an overcomplication of a simple problem. Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction? Then, if hash == 0, it just means it is a call that doesn't need instrumentation.
> > > 
> > > This latter approach will require less code and should be easier to maintain compatible with other CFI approaches. If the reason is because you don't want to have a useless attribute for non-call instructions, then you could possibly have a map where you bind the call instruction with a respective hash.
> > > 
> > > Unless there is a strong reason for these, I would much better prefer the slim approach suggested. Either way, if there is a reason for this, I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > > Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction?
> > 
> > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?
> > 
> > > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > 
> > Always happy to hear suggestions for alternative naming. Did you have something in mind?
> > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?
> 
> My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.
> 
> Imagine you have some mitigation like [[ https://www.cs.columbia.edu/~vpk/research/kguard/ | kguard  ]] being implemented. Now you can have calls which are KCFI checked but not KGUARD checked; then KCFI not-checked but KGUARD checked; then KCFI and KGUARD checked.; then none-checked. And then you need all these variations for tail calls (which imho is a first, minor, instance of the problem)...
> 
> So, in general, my understanding is that this approach works, yeah, but that in the long term it could become a hassle... so ideally we should use attributes to define these sub-specific instructions instead of opcodes.
> 
> > 
> > > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > 
> > Always happy to hear suggestions for alternative naming. Did you have something in mind?
> 
> I think switching from KCFI into CFI would already be good enough, as in the end these are all implementing the [[ https://dl.acm.org/doi/10.1145/1102120.1102165 | control-flow integrity ]] concept.
> My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.

I do agree, if there are enough call variants that can be combined, this will quickly get messy. So far this probably just hasn't been an issue. I'm not particularly happy about the multiple pseudo instructions here either.

@pcc, any thoughts about the best way to make this cleaner? Should we switch to a `MachineInstr` attribute, or perhaps pass another operand with the specific call type, or something else?

> so ideally we should use attributes to define these sub-specific instructions instead of opcodes.

One concern I have with using an attribute is that we have to ensure that no pass accidentally drops it while transforming the call instruction or replacing it with something else. That's not an issue if we have a separate pseudo instruction with the type as an operand. Did you run into any issues with this?

Also, how do you serialize the type attribute in MIR? Something similar to `debug-instr-number`?

> I think switching from KCFI into CFI would already be good enough

Sure, sounds good to me. I'll change the naming once we have a consensus on the correct approach here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296



More information about the cfe-commits mailing list