[PATCH] D92261: [dfsan] Track field/index-level shadow values in variables

Tue Dec 8 10:44:28 PST 2020

morehouse added inline comments.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:215

+static Value *expandFromPrimitiveShadowRecursive(
+    Value *Shadow, SmallVector<unsigned, 4> &Indices, Type *SubShadowTy,
----------------
Nit: Moving this helper function closer to where it's used would improve readability.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:546
+  /// CFP(other types, PS) = PS
+  Value *convertFromPrimitiveShadow(Type *T, Value *PrimitiveShadow,
+                                    Instruction *Pos);
----------------
WDYT about renaming to `expandFromPrimitiveShadow`?

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:1705
       IRBuilder<> IRB(Pos);
-      IRB.CreateStore(Shadow, i->second);
+      // Value *PrimitiveShadow = collapseToPrimitiveShadow(Shadow, Pos);
+      IRB.CreateStore(PrimitiveShadow, i->second);
----------------
Please delete commented-out code here and elsewhere.

================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:788
+  if (CS.Block && DT.dominates(CS.Block, Pos->getParent()) &&
+      DT.dominates(CS.Shadow, Pos))
+    return CS.Shadow;
----------------
stephan.yichao.zhao wrote:
> morehouse wrote:
> > This additional check isn't used for the CCS cache.  Why do we need it here?
> The use of the CCS cache assumes that in the same block instructions are visited sequentially. So it does not need to check domination inside the same block.
> 
> CachedShadow2PrimitiveShadow has a case [[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp#L1090 | here ]].
> Somehow it may insert a new instruction in a reversed order because of phi node insertion and because this ClDebugNonzeroLabels feature is a post-process.
> So we added this to ensure in the same block CS.Shadow dominates Pos. 
Please add a comment explaining this.

If `CS.Shadow` dominates `Pos`, do we still need to check that `CS.Block` dominates `Pos->getParent()`?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/abilist_aggregate.ll:181
+  ; TLS_ABI: store [2 x i16] zeroinitializer, [2 x i16]* inttoptr (i64 add (i64 ptrtoint ([100 x i64]* @__dfsan_arg_tls to i64), i64 4) to [2 x i16]*), align [[ALIGN]]
+  ; TLS_ABI: %_dfsret = load { i16, i16 }, { i16, i16 }* bitcast ([100 x i64]* @__dfsan_retval_tls to { i16, i16 }*), align [[ALIGN]]
+
----------------
I'm not familiar with the custom wrapper logic.  Why does this function store 0 to arg TLS?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/abilist_aggregate.ll:191
+  ; TLS_ABI: {{.*}} = load { i16, i16 }, { i16, i16 }* bitcast ([100 x i64]* @__dfsan_arg_tls to { i16, i16 }*), align [[ALIGN]]
+  ; TLS_ABI: store { i16, i16 } {{.*}}, { i16, i16 }* bitcast ([100 x i64]* @__dfsan_retval_tls to { i16, i16 }*), align [[ALIGN]]
+
----------------
This function should store `{a1, b0}` shadow to retval TLS, right?  Should we verify that?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:213
+  ; LEGACY: [[SP1:%.*]] = getelementptr i16, i16* [[SP]], i32 1
+  ; LEGACY: store i16 [[S]], i16* [[SP1]], align [[ALIGN]]
+  
----------------
Why are there two stores to `SP`?  `2 x i1` is less than 1 byte, so wouldn't a single i16 shadow be enough?

Or is there a hidden 1 byte alignment in the array?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:245
+  ; FAST16: [[E:%.*]] = extractvalue [17 x i16] {{.*}}, 16
+  ; FAST16: [[S:%.*]] = or i16 {{.*}}, [[E]]
+  ; FAST16: [[V1:%.*]] = insertelement <8 x i16> undef, i16 [[S]], i32 0
----------------
Shouldn't there be more ORs for each element in `%a`?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/array.ll:260
+  ; FAST16: [[P3:%.*]] = getelementptr i16, i16* [[P]], i32 16
+  ; FAST16: store i16 [[S]], i16* [[P3]], align [[ALIGN]]
+  store [17 x i1] %a, [17 x i1]* %p
----------------
What does this last store to "P3" do?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/struct.ll:20
+  
+  ; DEBUG_NONZERO_LABELS: @"dfs$pass_struct"
+  ret {i8*, i32} %s
----------------
What's the reason for having `DEBUG_NONZERO_LABELS` here when it tests nothing interesting?

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/struct.ll:260
+
+declare %StructOfAggr @fun_with_mang_aggr_args(<2 x i7> %v, [2 x i5] %a, {i3, i3} %s)
+
----------------

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/struct.ll:273
+
+  %r = call %StructOfAggr @fun_with_mang_aggr_args(<2 x i7> %v, [2 x i5] %a, {i3, i3} %s)
+  ret %StructOfAggr %r
----------------

================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/union-large.ll:3013
+  %ab = select i1 %c, {i32, i32} %a, {i32, i32} %b
+  ret {i32, i32} %ab
 }
----------------
What's the reason for changing this test?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92261/new/

https://reviews.llvm.org/D92261