[clang] c49df15 - [dfsan][NFC] Describe how origin trace tracking works

Jianzhou Zhao via cfe-commits cfe-commits at lists.llvm.org
Tue Jul 27 14:11:20 PDT 2021


Author: Jianzhou Zhao
Date: 2021-07-27T21:10:39Z
New Revision: c49df15c278857adecd12db6bb1cdc96885f7079

URL: https://github.com/llvm/llvm-project/commit/c49df15c278857adecd12db6bb1cdc96885f7079
DIFF: https://github.com/llvm/llvm-project/commit/c49df15c278857adecd12db6bb1cdc96885f7079.diff

LOG: [dfsan][NFC] Describe how origin trace tracking works

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D106903

Added: 
    

Modified: 
    clang/docs/DataFlowSanitizerDesign.rst

Removed: 
    


################################################################################
diff  --git a/clang/docs/DataFlowSanitizerDesign.rst b/clang/docs/DataFlowSanitizerDesign.rst
index ea40fe332010..bed4d2f38cba 100644
--- a/clang/docs/DataFlowSanitizerDesign.rst
+++ b/clang/docs/DataFlowSanitizerDesign.rst
@@ -135,6 +135,35 @@ Users are responsible for managing the 8 integer labels (i.e., keeping
 track of what labels they have used so far, picking one that is yet
 unused, etc).
 
+Origin tracking trace representation
+------------------------------------
+
+An origin tracking trace is a list of chains. Each chain has a stack trace
+where the DFSan runtime records a label propagation, and a pointer to its
+previous chain. The very first chain does not point to any chain.
+
+Every four 4-byte-aligned application bytes share a 4-byte origin trace ID. A
+4-byte origin trace ID contains a 4-bit depth and a 28-bit hash ID of a chain.
+
+A chain ID is calculated as a hash from a chain structure. A chain structure
+contains a stack ID and the previous chain ID. The chain head has a zero
+previous chain ID. A stack ID is a hash from a stack trace. The 4-bit depth
+limits the maximal length of a path. The environment variable ``origin_history_size``
+can set the depth limit. Non-positive values mean unlimited. Its default value
+is 16. Once the limit is reached, origin tracking ignores any subsequent
+propagation chains.
+
+The first chain of a trace starts by ``dfsan_set_label`` with non-zero labels. A
+new chain is appended at the end of a trace at stores or memory transfers when
+``-dfsan-track-origins`` is 1. Memory transfers include LLVM memory transfer
+instructions, glibc memcpy and memmove. When ``-dfsan-track-origins`` is 2, a
+new chain is also appended at loads.
+
+Other instructions do not create new chains, but simply propagate origin trace
+IDs. If an instruction has more than one operand with a non-zero label, the origin
+trace ID of the last operand with a non-zero label is propagated to the result of
+the instruction.
+
 Memory layout and label management
 ----------------------------------
 

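Note (not part of the commit): the ID layout and the propagation rule described
in the added section can be sketched in C roughly as follows. This is only an
illustration under stated assumptions, not the DFSan runtime code. The names
make_origin_id, origin_depth, origin_chain_hash, origin_block and
propagate_origin are hypothetical, and placing the depth in the top 4 bits of
the ID, as well as using 8-bit labels, is assumed here; the text above only
says that the ID holds a 4-bit depth and a 28-bit chain hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: depth in the top 4 bits, chain hash in the low 28 bits. */
#define DEPTH_BITS 4u
#define HASH_BITS 28u
#define DEPTH_MASK ((UINT32_C(1) << DEPTH_BITS) - 1u)
#define HASH_MASK ((UINT32_C(1) << HASH_BITS) - 1u)

/* Pack a depth and a chain hash into one 4-byte origin trace ID. */
static uint32_t make_origin_id(uint32_t depth, uint32_t chain_hash) {
  return ((depth & DEPTH_MASK) << HASH_BITS) | (chain_hash & HASH_MASK);
}

/* Extract the 4-bit depth from an origin trace ID. */
static uint32_t origin_depth(uint32_t id) { return id >> HASH_BITS; }

/* Extract the 28-bit chain hash from an origin trace ID. */
static uint32_t origin_chain_hash(uint32_t id) { return id & HASH_MASK; }

/* Four 4-byte-aligned application bytes share one origin trace ID, so an
   address maps to the start of its aligned 4-byte block. */
static uintptr_t origin_block(uintptr_t addr) { return addr & ~(uintptr_t)3; }

/* Propagation for instructions that do not create a chain: the result takes
   the origin trace ID of the last operand whose label is non-zero, or 0 if
   no operand is labeled. */
static uint32_t propagate_origin(const uint8_t *labels,
                                 const uint32_t *origins, size_t n) {
  uint32_t result = 0;
  for (size_t i = 0; i < n; i++) {
    if (labels[i] != 0)
      result = origins[i];
  }
  return result;
}
```

For example, with operand labels {1, 0, 2} and origin trace IDs {10, 20, 30},
propagate_origin returns 30, the ID of the last labeled operand.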