[PATCH] D65761: Add Windows Control Flow Guard checks (/guard:cf).

Thu Sep 5 17:50:53 PDT 2019

rnk added a subscriber: hans.
rnk added a comment.

Here is some feedback, I apologize for dragging my feet.

Also, I would really like to get feedback from @pcc. He's OOO currently, though.

================
Comment at: clang/include/clang/Driver/Options.td:500
 def b : JoinedOrSeparate<["-"], "b">, Flags<[Unsupported]>;
+def cfguard_no_checks : Flag<["-"], "cfguard-no-checks">, Flags<[CC1Option]>,
+  HelpText<"Emit Windows Control Flow Guard tables only (no checks).">;
----------------
@hans, WDYT about the -cc1 interface? I think this is fine.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5972
+    if (Instrument && !NoChecks)
+      //Emit CFG instrumentation and the table of address-taken functions.
       CmdArgs.push_back("-cfguard");
----------------
Comment formatting needs to be fixed, you can use clang-format for this.

================
Comment at: llvm/include/llvm/CodeGen/MachineBasicBlock.h:179
+    assert((!CachedMCSymbol || LabelIsPublic == true) && "Cannot set label to public after it has been accessed.");
+    LabelIsPublic = true; 
+  }
----------------
I hesitate to make MBB labels public, I think it might be better to build your own MCSymbol label with a public name.

================
Comment at: llvm/lib/CodeGen/CFGuardLongjmp.cpp:101
+  // targets.
+  for (MachineInstr *Setjmp : SetjmpCalls) {
+    MachineBasicBlock *OldMBB = Setjmp->getParent();
----------------
Rather than changing the machine CFG, I would suggest using MachineInstr::setPostInstrSymbol, which I've been planning to use for exactly this purpose. :) You can make labels with MCContext::createSymbol, and you'll want to come up with symbols that won't clash with C or C++ symbols. I see MSVC makes labels like `$LN4`, so maybe `$cfgsjN` or something like that. I think MCContext will add numbers for you.

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:11
+/// This file contains the machine instruction transform to fixup the Control
+/// Flow Guard checks inserted by /X86CFGuard.cpp. This pass searches
+/// for such cases and replaces the pair of instructions with a single
----------------
I guess it's not X86CFGuard.cpp now that you've generalized it as a cross-platform IR pass.

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:57-58
+
+/// MachineFunction pass to fix x86 Control Flow Guard checks that have been
+/// generated as separate load and call/invoke instructions.
+class X86FixupCFGuard : public MachineFunctionPass {
----------------
Can you elaborate on what the correctness requirement is? The correctness requirement is just that the loaded value is not saved in a CSR which is then spilled and reloaded from memory across a call site. If possible, I think it would be worth teaching both FastISel and SDISel to build the correct MachineInstrs during instruction selection, and turn this into a validation pass that fails if it sees indirect calls through registers in functions where CFG is enabled.

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:114
+    OldCallOpcode = X86::CALL64r;
+    NewCallOpcode = X86::CALL64m;
+  } else {
----------------
I think you probably also want to look for tail calls. I think the best way to do this would be to use `MI.isCall()` in the loop, and then switch on the opcode to handle all call variants.

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:134
+  // load instruction
+  for (auto I = LoadMI.getIterator(), E = MBB->instr_end(); I != E; ++I) {
+    MachineInstr &MI = cast<MachineInstr>(*I);
----------------
Is it possible to use MachineRegisterInfo to iterate over all the uses of the load, instead of scanning the BB? It seems like this could be O(n^2) if a single BB contains a long sequence of indirect calls. I think it would look like this:
  MachineRegisterInfo *MRI = MF.getRegInfo();
  for (MachineInstr &MI : MRI->use_instructions(Reg))
    // do something with MI
There's also use_operands.

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:168-170
+  // Erase all marked call/invoke instructions
+  for (MachineInstr *MI : MIsToErase) {
+    MI->eraseFromParent();
----------------
Erasing at the end is always the safest way to do it. :)

================
Comment at: llvm/lib/Target/X86/X86FixupCFGuard.cpp:209
+  for (MachineInstr *LoadMI : GuardLoads) {
+    fixupGuardLoad(*LoadMI, TII);
+
----------------
This doesn't do anything with the return value, should it?

================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:38
+/// architectures (e.g. X86_32 uses cf_check, X86_64 uses cf_dispatch).
+class CFGuard : public ModulePass {
+public:
----------------
Does this need to be a ModulePass? It's preferable to avoid module passes if possible. When the pass manager has a sequence of function passes to run, it runs all passes on function f, then g, then h, and so on, rather than running the function pass over the entire module and then the next pass over the entire module, and so on. This is done to get better cache locality: each function pass touches f's data structures repeatedly, and then moves on to g without ever going back to f.

You should be able to do the global checks in an override of `doInitialization`, which receives a Module.

================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:42
+
+  enum Mechanism { cf_check, cf_dispatch };
+
----------------
I guess more LLVM-style names for these would be CF_Check, CF_Dispatch. Of course, Rui is changing the style soon anyway.

================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:220
+    GuardDispatch =
+        B.CreateCall(GuardDispatchType, GuardDispatchLoad, GuardDispatchArgs);
+  } else if (InvokeInst::classof(CB)) {
----------------
You'd want to set the tail call kind here to forward `tail` or `musttail`. This matters for things like member pointer function thunks. This C++ makes a musttail thunk, for example:
  struct A { virtual void f(); };
  auto mp = &A::f;
Which gives this IR:
  define linkonce_odr void @"??_9A@@$BA at AA"(%struct.A* %this, ...) #0 comdat align 2 {
  entry:
    %0 = bitcast %struct.A* %this to void (%struct.A*, ...)***
    %vtable = load void (%struct.A*, ...)**, void (%struct.A*, ...)*** %0, align 8, !tbaa !3
    %1 = load void (%struct.A*, ...)*, void (%struct.A*, ...)** %vtable, align 8
    musttail call void (%struct.A*, ...) %1(%struct.A* %this, ...)
    ret void
  }
This does an indirect virtual call through the 0th slot, and if we lose musttail, it won't perfectly forward arguments.

================
Comment at: llvm/lib/Transforms/CFGuard/CFGuard.cpp:254
+  // register (i.e. RAX on 64-bit X86 targets)
+  GuardDispatch->setCallingConv(CallingConv::CFGuard_Dispatch);
+
----------------
So, this isn't going to work if the original calling convention was something other than the default. For x64, the only commonly used non-standard convention would be vectorcall. Changing the convention here in the IR is going to make it hard to pass that along.

I think as you can see from how much code you have here already, replacing call instructions in general is really hard. These days there are also bundles, which I guess is something else missing from the above.

Here's a sketch of an alternative approach:
- load __guard_dispatch_icall_fptr as normal
- get the pre-existing function type of the callee
- cast the loaded pointer to the original function pointer type
- use `CB->setCalledOperand` to replace the call target
- add the original call target as an "argument bundle" to the existing instruction
- change SelectionDAGBuilder.cpp to recognize the new bundle kind and create a new ArgListEntry in the argument list
- make and set a new bit in ArgListEntry, something like isCFGTarget
- change CC_X86_Win64_C to put CFGTarget arguments in RAX, similar to how CCIfNest puts things in R10

This way, the original call instruction remains with exactly the same arrangement of args, attributes, callbr successors, tail call kind, etc. All you're doing is changing the callee, and passing the original callee as an extra argument on the side. Basically, this approach leverages bundles to avoid messing with the function type, which more or less requires replacing the call. :)

There are probably issues with this design that I haven't foreseen, but I think it's worth a try.

================
Comment at: llvm/test/CodeGen/AArch64/cfguard-checks.ll:3-4
+
+; This test is disabled on Linux
+; UNSUPPORTED: linux
+
----------------
These annotations shouldn't be necessary, the tests should pass as written on Linux. The mtriple flag above tells llc to generate code for Windows.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65761/new/

https://reviews.llvm.org/D65761