[PATCH] D92261: [dfsan] Track field/index-level shadow values in variables

Tue Dec 8 17:07:50 PST 2020

stephan.yichao.zhao marked 15 inline comments as done.
stephan.yichao.zhao added inline comments.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:788
+  if (CS.Block && DT.dominates(CS.Block, Pos->getParent()) &&
+      DT.dominates(CS.Shadow, Pos))
+    return CS.Shadow;
----------------
morehouse wrote:
> stephan.yichao.zhao wrote:
> > morehouse wrote:
> > > This additional check isn't used for the CCS cache.  Why do we need it here?
> > The use of the CCS cache assumes that in the same block instructions are visited sequentially. So it does not need to check domination inside the same block.
> > 
> > CachedShadow2PrimitiveShadow has a case [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1090 | here ]].
> > Somehow it may insert a new instruction in a reversed order because of phi node insertion and because this ClDebugNonzeroLabels feature is a post-process.
> > So we added this to ensure in the same block CS.Shadow dominates Pos. 
> Please add a comment explaining this.
> 
> If `CS.Shadow` dominates `Pos`, do we still need to check that `CS.Block` dominates `Pos->getParent()`?
Added the comments at CachedCollapsedShadows. Actually we only need to catch values since we do not need to check Block dominations if we have to check dominations between values.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/abilist_aggregate.ll:181
+  ; TLS_ABI: store [2 x i16] zeroinitializer, [2 x i16]* inttoptr (i64 add (i64 ptrtoint ([100 x i64]* @__dfsan_arg_tls to i64), i64 4) to [2 x i16]*), align [[ALIGN]]
+  ; TLS_ABI: %_dfsret = load { i16, i16 }, { i16, i16 }* bitcast ([100 x i64]* @__dfsan_retval_tls to { i16, i16 }*), align [[ALIGN]]
+
----------------
morehouse wrote:
> I'm not familiar with the custom wrapper logic.  Why does this function store 0 to arg TLS?
This seems the effect of the existing design.

dfs$call_custom_cb calls the customized @__dfsw_custom_cb.
The user-provided @__dfsw_custom_cb calls @call_custom_cb.

And the instrumentation around "call %cb" inside @call_custom_cb is by this [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L767 | visitCallInst ]]. 

This visitor is from
```
   DFSanFunction DFSF(*this, F, /*IsNativeABI=*/true);
```

When IsNativeABI is one, getShadow always returns [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1153 | 0 ]] for arguments, and 
@custom_cb [[ https://github.com/llvm/llvm-project/blame/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1671 | does not ]] return shadow.

@__dfsw_custom_cb receives dfst0$custom_cb and @"dfs$cb" instead of @cb.
@"dfs$cb" is a dfsan @cb that uses TLS to pass shadows at args/ret, while dfst0$custom_cb is a wrapper of @"dfs$cb", and dfst0$custom_cb passes shadows by additional arguments.

I think in @__dfsw_custom_cb, users' code is like

```
def @__dfsw_custom_cb(@"dfst0$custom_cb, @"dfs$cb", arg1, arg2, ..., arg_shadow1, arg_shadow2, ..., ret_shadow) {
    auto cb_ret_shadow;
    auto my_cb = [&] (...) {
      ...
      ret @"dfst0$custom_cb"(@"dfs$cb", cb_arg1, cb_arg2, ..., cb_arg_shadow1, cb_arg_shadow2, ..., cb_ret_shadow);
    }
    auto r = @custom_cb(my_cb, ...)
    // set ret_shadow in terms of all shadows...
}
```

The puzzle is that all arguments shadow are [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L766 | assigned ]] to this DFSF, although they should never be used. It is from the [[ https://github.com/llvm/llvm-project/commit/28a10aff4828bd07cfda28ccc6048c775b5b9f60 | first version ]].

I am not sure this is supposed to be the correct custom wrapper of a function with callbacks.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:213
+  ; LEGACY: [[SP1:%.*]] = getelementptr i16, i16* [[SP]], i32 1
+  ; LEGACY: store i16 [[S]], i16* [[SP1]], align [[ALIGN]]
+  
----------------
morehouse wrote:
> Why are there two stores to `SP`?  `2 x i1` is less than 1 byte, so wouldn't a single i16 shadow be enough?
> 
> Or is there a hidden 1 byte alignment in the array?
This is defined by the above data layout. It defines i1:8:8. So each i1 takes 1 byte, and we have 2-byte shadow for each 1 byte.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:245
+  ; FAST16: [[E:%.*]] = extractvalue [17 x i16] {{.*}}, 16
+  ; FAST16: [[S:%.*]] = or i16 {{.*}}, [[E]]
+  ; FAST16: [[V1:%.*]] = insertelement <8 x i16> undef, i16 [[S]], i32 0
----------------
morehouse wrote:
> Shouldn't there be more ORs for each element in `%a`?
This is where the current diff loses accuracy. 
When saving an aggregate value into memory, we call that collapse function to convert an accurate shadow to a i16 label.
So this diff only increases accuracy for variables, arguments and ret.

This works for O1-compiled targets, because alloca premotion removes lots of memory operations, and practice code does not save aggregate types to memory.
If we build by O0, it does not work as those pair.cc and struct.c test.

We need to address this in the next change.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:260
+  ; FAST16: [[P3:%.*]] = getelementptr i16, i16* [[P]], i32 16
+  ; FAST16: store i16 [[S]], i16* [[P3]], align [[ALIGN]]
+  store [17 x i1] %a, [17 x i1]* %p
----------------
morehouse wrote:
> What does this last store to "P3" do?
This is testing the loop from [[ https://github.com/llvm/llvm-project/blame/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1502 | here ]] to [[ https://github.com/llvm/llvm-project/blame/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1527 | here ]].

when a data to store is large, it first saves vectors, then saves the rest as primitive types.
The instructions before P3 are about vector saving, the rest are for primitive-data saving.

This is another reason this diff collapses aggregate shadow to i16. It makes the code still reuse the same code for storing/loading.

The next change that preserves aggregate accuracy needs to redesign this logic.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/struct.ll:20
+  
+  ; DEBUG_NONZERO_LABELS: @"dfs$pass_struct"
+  ret {i8*, i32} %s
----------------
morehouse wrote:
> What's the reason for having `DEBUG_NONZERO_LABELS` here when it tests nothing interesting?
Thank you for catching this. Added.

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/union-large.ll:3013
+  %ab = select i1 %c, {i32, i32} %a, {i32, i32} %b
+  ret {i32, i32} %ab
 }
----------------
morehouse wrote:
> What's the reason for changing this test?
this is not necessary. removed. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92261/new/

https://reviews.llvm.org/D92261