[PATCH] D92261: [dfsan] Track field/index-level shadow values in variables

Tue Dec 8 01:25:33 PST 2020

stephan.yichao.zhao added a comment.

In D92261#2437862 <https://reviews.llvm.org/D92261#2437862>, @morehouse wrote:

> How does this change affect instrumented binary size?

For the same large application used by D92440 <https://reviews.llvm.org/D92440>, this diff added 0.004% code size overhead. 
Because the change of D92440 <https://reviews.llvm.org/D92440> reduces 0.1%, so after the diff, the code size is still smaller.
I think this is because returning or passing struct/array is rare in C++ code, although if this happens at critical code path, it introduces overtaint as the motivation example used in the description.

We also looked into a small code to see how this changes code gen. Interestingly, somehow at TLS part, the diff's code is slightly smaller.
For example, for a C code

  typedef struct Pair {
    int i;
    char *ptr;
  } Pair;
  Pair make_pair(int i, char *ptr) {
    Pair pair;
    pair.i = i;
    pair.ptr = ptr;
    return pair;
  }

At ret. the old dfsan does

  mov    %rsi,%rdx
  mov    %edi,%eax
  mov    0x0(%rip),%rcx 
  movzwl %fs:(%rcx),%esi
  or     %fs:0x2(%rcx),%si
  mov    0x0(%rip),%rcx 
  mov    %si,%fs:(%rcx)
  retq

the new dfsan does

  mov    %rsi,%rdx
  mov    %edi,%eax
  mov    0x0(%rip),%rcx 
  mov    %fs:(%rcx),%ecx
  mov    0x0(%rip),%rsi  
  mov    %ecx,%fs:(%rsi)
  retq

We can see although Pair has two fields, it is represented by one register.
Because the old dfsan needs to union the two fields before returning, this seems a more complicated register-level code because of unioning a register itself.., while the new dfsan does not do such a union.

I feel there could be cases where the new dfsan may generate more code.
But because in a large application, passing/returning complicated struct is not common, it seems fine.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:520
+  /// CTP(other types, PS) = PS
+  Value *convertToPrimitiveShadow(Value *Shadow, Instruction *Pos);

----------------
morehouse wrote:
> Nit:  I think `expandFromPrimitiveShadow` and `collapseToPrimitiveShadow` are more intuitive names.
renamed to collapseToPrimitiveShadow

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:679
+
+Value *DFSanFunction::convertFromPrimitiveShadowByShadowType(
+    Value *Shadow, SmallVector<unsigned, 4> &Indices, Type *SubShadowTy,
----------------
morehouse wrote:
> Nit:  This is really just a recursive helper function.  Maybe a name like `expandFromPrimitiveShadowRecursive` is more descriptive?  Also since it doesn't access any member variables, it could just be a local static function.
moved it to a static function expandFromPrimitiveShadowRecursive.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:746
+    Value *ShadowInner = convertToPrimitiveShadow(ShadowItem, IRB);
+    Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
+  }
----------------
morehouse wrote:
> OR only works for fast16 mode.
Thank you for catching this! My original test case worked for non-fast16 accidentially because it used only labels 1 and 2..., and our large applications use only fast16 mode..

The or operation of the legacy mode is more complicated; plus, introducing more 'or' can consume legacy labels more quickly. So it is not clear if it is a good trade-off between accuracy and running-out-of-labels...

I added a shouldTrackFieldsAndIndices method to ensure this diff only enables field-level shadow for fast16 mode. none-fast16 mode needs more evaluation in the next step.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:768
+Value *DFSanFunction::convertToPrimitiveShadow(Value *Shadow,
+                                               IRBuilder<> &IRB) {
+  Type *ShadowTy = Shadow->getType();
----------------
morehouse wrote:
> Since this function is small, I think we should inline it into `convertToPrimitiveShadow` below.
This convertToPrimitiveShadow and the above collapseArrayShadow make a mutual recursion.

The convertToPrimitiveShadow below is like the main entry point. 

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:788
+  if (CS.Block && DT.dominates(CS.Block, Pos->getParent()) &&
+      DT.dominates(CS.Shadow, Pos))
+    return CS.Shadow;
----------------
morehouse wrote:
> This additional check isn't used for the CCS cache.  Why do we need it here?
The use of the CCS cache assumes that in the same block instructions are visited sequentially. So it does not need to check domination inside the same block.

CachedShadow2PrimitiveShadow has a case [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1090 | here ]].
Somehow it may insert a new instruction in a reversed order because of phi node insertion and because this ClDebugNonzeroLabels feature is a post-process.
So we added this to ensure in the same block CS.Shadow dominates Pos. 

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:1432
+  return convertFromPrimitiveShadow(T, PrimitiveValue, Pos);
+}
+
----------------
morehouse wrote:
> Nit:  This function is tiny.  Maybe we should just inline it everywhere instead?
It was inlined before. I found it is used many times, so made it a function.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:1438
   if (DFS.isZeroShadow(V1))
-    return V2;
+    return convertToPrimitiveShadow(V2, Pos);
   if (DFS.isZeroShadow(V2))
----------------
morehouse wrote:
> Do you plan to make this work without converting everything to primitive shadow?
Yes. 

The conversion or collapsing happens in mainly the following cases

1) memory operations
The next step is making load/store/alloca use non-collapsed values. This is required for O0-compiled code. O0 does not promote alloca to variables. So most struct operations go through memory...

2) ClCombinePointerLabelsOnStore/ClCombinePointerLabelsOnLoad/ClTrackSelectControlFlow
When ClCombinePointerLabelsOnStore/ClCombinePointerLabelsOnLoad/ClTrackSelectControlFlow are true, we can also do field-level combining rather than combining-then-union.

3) custom function wrapper
It is not clear if any wrapper needs struct parameters or ret values yet

4) when combineShadows is used by combineOperandShadows
I feel in this case most operands are not aggregate types except select, insert/extract, which are already considered separately. If any op can use aggregate types, we could treat them specially.

In the above 4 cases, probably we wanted to see how non-collapsed propagation changes code size to get a better trade-off.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:1786
+    Shadow = DFSF.combineShadowsThenConvert(SI.getValueOperand()->getType(),
+                                            Shadow, PtrShadow, &SI);
   }
----------------
morehouse wrote:
> What's the reason for expanding the shadow when we're about to collapse it again in `storeShadow`?
Good catch! I renamed storeShadow to be storePrimitiveShadow to indicate it reads primitive values, and we renamed unnecessary conversions.

W/o the change, it indeed generates dead-code, although the following pass may remove them.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92261/new/

https://reviews.llvm.org/D92261