[PATCH] D119296: KCFI sanitizer

Thu Jun 23 18:59:17 PDT 2022

pcc added inline comments.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6694

-  if (isExternallyVisible(T->getLinkage())) {
+  if (isExternallyVisible(T->getLinkage()) || !OnlyExternal) {
     std::string OutName;
----------------
It would be better to have a separate function for computing the KCFI type ids than to try and reuse this one, since you don't need the MDStrings (i.e. unnecessary string uniquing) in your new caller. It also isn't appropriate to pass MetadataIdMap in your new caller because you'll end up with inconsistent values added to MetadataIdMap if we end up calling the other caller (e.g. if both CFI and KCFI are enabled), but that's moot if you avoid it.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2257
+
+  F->setPrefixData(CreateKCFITypeId(FD->getType()));
+  F->addFnAttr("kcfi-target");
----------------
samitolvanen wrote:
> ychen wrote:
> > FYI: using prefix data may not work for the C++ coroutine. (https://github.com/llvm/llvm-project/issues/49689) because corosplit pass may clone coro functions and change its function type. But prefix data is opaque, so there is no way to detect this, and then drop the prefix data in the corosplit pass.
> > 
> > If this is a concern for now or for the near future, it might be better to use alternative approaches like D115844.
> > FYI: using prefix data may not work for the C++ coroutine.
> 
> Thanks for the link, that's interesting. The Linux kernel doesn't use C++ so this isn't a concern there, but I suppose there could theoretically also be C++ users for this feature. @pcc, any thoughts if we should just switch from prefix data to a metadata node here?
I think the issue with coroutines was more about the fact that the function was cloned (invalidating the relative reference in the prefix data) than that the function type was changed. I seem to recall from a discussion with the coroutine folks that the functions created by the coroutine pass with different types aren't actually intended to be called directly from C/C++, so the fact that the types are changed doesn't matter.

I think the main advantage of metadata here would be in doing things like the check for type mismatches in direct calls more easily, so it seems like a reasonable approach even though the rationale is different.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:2453
+  // additional machine instructions being emitted between the check and
+  // the call. This means we don't have to worry about expanding BLR_BTI
+  // and TCRETURNri* pseudos.
----------------
For PAuth ABI the motivation for avoiding the gap between check and call was around avoiding spilling the verified pointer, is this not possible with KCFI?

================
Comment at: llvm/lib/Target/X86/X86AsmPrinter.cpp:119
+  MCSymbol *FnSym = OutContext.getOrCreateSymbol("__cfi_" + MF.getName());
+  // Use the same linkage as the parent function.
+  emitLinkage(&MF.getFunction(), FnSym);
----------------
If this is just about satisfying objtool do these symbols need to be exported or can they just be STB_LOCAL?

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.h:83
+    KCFI_NT_CALL,
+    KCFI_TC_RETURN,
+
----------------
samitolvanen wrote:
> joaomoreira wrote:
> > samitolvanen wrote:
> > > joaomoreira wrote:
> > > > I did not revise the entire patch yet. With this said, IMHO, this looks like an overcomplication of a simple problem. Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction? Then, if hash == 0, it just means it is a call that doesn't need instrumentation.
> > > > 
> > > > This latter approach will require less code and should be easier to maintain compatible with other CFI approaches. If the reason is because you don't want to have a useless attribute for non-call instructions, then you could possibly have a map where you bind the call instruction with a respective hash.
> > > > 
> > > > Unless there is a strong reason for these, I would much better prefer the slim approach suggested. Either way, if there is a reason for this, I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > > > Is there a reason why you really need specific KCFI_ nodes instead of only embedding the hash information into an attribute at the Machine Instruction?
> > > 
> > > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?
> > > 
> > > > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > > 
> > > Always happy to hear suggestions for alternative naming. Did you have something in mind?
> > > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and basically all other pseudo call instructions in LLVM. Is adding an attribute to `MachineInstr` the preferred approach instead?
> > 
> > My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.
> > 
> > Imagine you have some mitigation like [[ https://www.cs.columbia.edu/~vpk/research/kguard/ | kguard  ]] being implemented. Now you can have calls which are KCFI checked but not KGUARD checked; then KCFI not-checked but KGUARD checked; then KCFI and KGUARD checked.; then none-checked. And then you need all these variations for tail calls (which imho is a first, minor, instance of the problem)...
> > 
> > So, in general, my understanding is that this approach works, yeah, but that in the long term it could become a hassle... so ideally we should use attributes to define these sub-specific instructions instead of opcodes.
> > 
> > > 
> > > > I would also suggest that you at least don't name these as "KCFI_something", as in the future others might want to reuse the same structure for other CFI approaches.
> > > 
> > > Always happy to hear suggestions for alternative naming. Did you have something in mind?
> > 
> > I think switching from KCFI into CFI would already be good enough, as in the end these are all implementing the [[ https://dl.acm.org/doi/10.1145/1102120.1102165 | control-flow integrity ]] concept.
> > My understanding is that if, every time a new mitigation or optimization comes in, you create a new opcode for it, it will eventually bloat to non-feasibility.
> 
> I do agree, if there are enough call variants that can be combined, this will quickly get messy. So far this probably just hasn't been an issue. I'm not particularly happy about the multiple pseudo instructions here either.
> 
> @pcc, any thoughts about the best way to make this cleaner? Should we switch to a `MachineInstr` attribute, or perhaps pass another operand with the specific call type, or something else?
> 
> > so ideally we should use attributes to define these sub-specific instructions instead of opcodes.
> 
> One concern I have with using an attribute is that we have to ensure that no pass accidentally drops it while transforming the call instruction or replacing it with something else. That's not an issue if we have a separate pseudo instruction with the type as an operand. Did you run into any issues with this?
> 
> Also, how do you serialize the type attribute in MIR? Something similar to `debug-instr-number`?
> 
> > I think switching from KCFI into CFI would already be good enough
> 
> Sure, sounds good to me. I'll change the naming once we have a consensus on the correct approach here.
Would it make sense to add a field to `MachineInstr::ExtraInfo` to hold the additional information here? That way you can avoid increasing the size of `MachineInstr`.

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:3093
+            FunctionType->getZExtValue() != ExpectedType->getZExtValue())
+          dbgs() << Call.getModule()->getName() << ":"
+                 << Call.getDebugLoc().getLine()
----------------
With CFI, if this situation occurs we unconditionally crash, the rationale being that if we end up in this situation it is a situation unexpected by the developer and may even be intended to be unreachable at runtime, so it's best to crash if it does occur. The same approach is used for things like UBSan integer overflow checks if the optimizer figures out that the overflow will always happen. If I'm understanding the code correctly KCFI will just let the call occur in this case and I'm not sure if that's a good idea.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296