[PATCH] D100967: [dfsan] Track origin at loads
George Balatsouras via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 21 11:13:37 PDT 2021
gbalats added inline comments.
================
Comment at: compiler-rt/test/dfsan/origin_track_ld.c:2
+// RUN: %clang_dfsan -gmlt -mllvm -dfsan-track-origins=2 -mllvm -dfsan-fast-16-labels=true %s -o %t && \
+// RUN: %run %t >%t.out 2>&1
+// RUN: FileCheck %s --check-prefix=CHECK < %t.out
----------------
================
Comment at: compiler-rt/test/dfsan/origin_track_ld.c:3
+// RUN: %run %t >%t.out 2>&1
+// RUN: FileCheck %s --check-prefix=CHECK < %t.out
+//
----------------
This is redundant as it's the default.
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:256-262
// Controls how to track origins.
// * 0: do not track origins.
// * 1: track origins at memory store operations.
-// * 2: TODO: track origins at memory store operations and callsites.
+// * 2: track origins at memory load and store operations.
static cl::opt<int> ClTrackOrigins("dfsan-track-origins",
cl::desc("Track origins of labels"),
cl::Hidden, cl::init(0));
----------------
Using different integer values to encode the level of tracking is hard to understand without looking at this exact comment right here. Why can't we use an enum instead with descriptive names?
E.g.,
```
enum OriginTrackingLevel {
None,
StoresOnly,
LoadsAndStores
};
```
https://llvm.org/docs/CommandLine.html#selecting-an-alternative-from-a-set-of-possibilities
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:644
+ /// Generates IR to load shadow and origin corresponding to bytes [Addr,
+ /// Addr+Size), where Addr has alignment Align, and take the union of each of
+ /// those shadows. The returned shadow always has primitive type.
----------------
You mean InstAlignment? By the way, for documenting specific arguments of functions, I think the special `\param` Doxygen syntax would be clearer.
https://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:647
+ ///
+ /// When enabling tracking loads, the returned origin is a chain at the
+ /// current stack if the returned shadow shadow is tainted.
----------------
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:709
+ ///
+ /// When enabling tracking load instructions, we always use
+ /// __dfsan_load_label_and_origin to reduce code size.
----------------
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:745
+ /// those shadows. The returned shadow always has primitive type.
+ std::pair<Value *, Value *> loadShadowOriginSansTracking(Value *Addr,
+ uint64_t Size,
----------------
What do you mean by this? The description doesn't indicate what's the different with `loadShadowOrigin`. Maybe `loadShadowOriginSansLoadTracking`?
================
Comment at: llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp:2441-2442
+ IRBuilder<> &IRB) {
+ if (!DFS.shouldTrackOrigins())
+ return Origin;
+ return IRB.CreateCall(DFS.DFSanChainOriginIfTaintedFn, {Shadow, Origin});
----------------
This function should be called only when tracking origins. Why not make that an assertion, instead of the if-stmt?
================
Comment at: llvm/test/Instrumentation/DataFlowSanitizer/origin_track_load.ll:1-2
+; RUN: opt < %s -dfsan -dfsan-track-origins=2 -dfsan-fast-8-labels -S | FileCheck %s --check-prefixes=CHECK
+; RUN: opt < %s -dfsan -dfsan-track-origins=2 -dfsan-fast-16-labels -S | FileCheck %s --check-prefixes=CHECK
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
----------------
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100967/new/
https://reviews.llvm.org/D100967
More information about the llvm-commits
mailing list