[PATCH] D75937: Add Support to X86 for Load Hardening to Mitigate Load Value Injection (LVI) [5/6]

Mon Apr 27 13:28:39 PDT 2020

sconstab added inline comments.

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:643
+          for (const auto &EE : N.edges()) {
+            if (MachineGadgetGraph::isCFGEdge(EE)) {
+              if (!CheapestSoFar || EE.getValue() < CheapestSoFar->getValue())
----------------
mattdr wrote:
> sconstab wrote:
> > mattdr wrote:
> > > Every edge touched by this loop will //also// be touched by the other leg of this loop at one point or another ("E is a CFG edge"). Can we find a way to avoid the `O(E^2)` inner loop?
> > > 
> > > For example: what if we made the outermost loop edge-major? Keep sets of mitigated source-nodes and sink-nodes. Create a list of all edges by weight, sort it, then go through the edges from lowest to highest weight and use them to try to mitigate a new source or sink node.
> > > 
> > > There's a good chance I'm missing something fundamental there, and I thank you in advance for your patience explaining it to me. But if I can I'd really like to avoid repeating the `CheapestSoFar` comparison on edge weights in two places.
> > Yes... in hindsight this whole algorithm was more than a bit sloppy on my part. I completely revamped the algorithm and I think that now it is O(N + E). Please check and make sure you agree.
> Each iteration loops over all edges and removes exactly one... so this is probably still `O(E^2)`.
> 
> Seems like we can get it down closer to `O(E * lg E)` if we:
> 1. Compute `GadgetSources` and `GadgetSinks` once
> 2. Put the edges into a vector, then sorting them by weight
> 3. Iterate through that vector and, for each edge, add it as an edge to cut if it is still relevant (i.e. not yet otherwise mitigated)
> 
> The insight here is that edge weights don't change, so mitigating an edge doesn't change the ranking of other edges.
> 
> That said, I'm happy with the readability of the current implementation and would be satisfied for now if we just add `//FIXME: this is O(E^2), it could probably be O(E * lg E) with some work`
I think your description actually implies that this greedy heuristic cannot be less that `O(E^2)`, right?
> Iterate through that vector and, for each edge, add it as an edge to cut if it is still relevant (i.e. not yet otherwise mitigated)
The "if it is still relevant" is an `O(E)` operation, so the whole thing should be `O(E^2)`.

Regardless, I did add a comment to this effect.

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:534
+// Returns the number of remaining gadgets after edge elimination
+int X86LoadValueInjectionLoadHardeningPass::elimEdges(
+    MachineGadgetGraph &G, MachineGadgetGraph::EdgeSet &ElimEdges /* in, out */,
----------------
mattdr wrote:
> This is a lot clearer now, thanks!
> 
> Given what this is doing I think it makes sense to call it `findMitigatedEdges`.
I changed it to `elimMitigatedEdgesAndNodes()`

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:541
+      if (isFence(Dest->getValue())) {
+        ElimNodes.insert(*Dest);
+        ElimEdges.insert(E);
----------------
mattdr wrote:
> How are we sure we've removed all edges that pointed to this same destination node?
I think it does? This loop iterates through ALL edges. For each CFG edge that ingresses a fence, add that fence to `ElimNodes`, and add that edge and all egress CFG edges to `ElimEdges`. The loop does *not* skip over edges that have been added to `ElimEdges`, and therefore I think this should ensure that all CFG edges pointing to a given fence are removed.

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:553
+  int MitigatedGadgets = 0, RemainingGadgets = 0;
+  MachineGadgetGraph::NodeSet Visited{G};
+  for (const auto &RootN : G.nodes()) {
----------------
mattdr wrote:
> To make it easier to follow, let's call this something like `ReachableGadgetSinkNodes`
But that isn't accurate.. The algorithm finds all reachable nodes, not just gadget sinks. I renamed to `ReachableNodes`.

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:555-556
+  for (const auto &RootN : G.nodes()) {
+    if (llvm::none_of(RootN.edges(), MachineGadgetGraph::isGadgetEdge))
+      continue; // skip this node if it isn't a gadget source
+
----------------
mattdr wrote:
> I think this is just intended as an optimization -- it's not necessary for correctness. Assuming that's right, suggest removing it since it doesn't really make things faster but it does add some extra complexity.
Why do you think it doesn't make things faster? A majority of nodes in the graph are not gadget sinks, and this majority tends to grow rapidly as gadgets become mitigated. Each time this check passes, it saves an entire DFS traversal and one `O(E)` loop through the edges.

================
Comment at: llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp:559
+    // Find all of the nodes that are reachable from RootN
+    std::function<void(const MachineGadgetGraph::Node *, bool)> TraverseDFS =
+        [&](const MachineGadgetGraph::Node *N, bool FirstNode) {
----------------
mattdr wrote:
> Maybe `FindCFGReachableGadgetSinksDFS`?
Similar to above, this is not accurate as it finds all reachable nodes, not just gadget sinks. I did rename to `FindReachableNodes` and added detail to the comment.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75937/new/

https://reviews.llvm.org/D75937