[all-commits] [llvm/llvm-project] bf6986: [TSan, SanitizerBinaryMetadata] Improve instrument...

Camsyn via All-commits all-commits at lists.llvm.org
Thu Apr 17 01:09:30 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bf6986f9f09f79da38006a83c339226c429bb686
      https://github.com/llvm/llvm-project/commit/bf6986f9f09f79da38006a83c339226c429bb686
  Author: Camsyn <camsyn at foxmail.com>
  Date:   2025-04-17 (Thu, 17 Apr 2025)

  Changed paths:
    M llvm/lib/Transforms/Instrumentation/SanitizerBinaryMetadata.cpp
    M llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
    M llvm/test/Instrumentation/ThreadSanitizer/capture.ll

  Log Message:
  -----------
  [TSan, SanitizerBinaryMetadata] Improve instrument for derived pointers via phis/selects (#132752)

ThreadSanitizer.cpp and SanitizerBinaryMetadata.cpp previously used
`getUnderlyingObject` to check if pointers originate from stack objects.

However, `getUnderlyingObject()` by default only looks through linear
chains, not selects/phis. In particular, this means that we miss cases
involving pointer induction variables.

For instance,
```llvm
%stkobj = alloca [2 x i32], align 8
; getUnderlyingObject(%derived) = %derived
%derived = getelementptr inbounds i32, ptr %stkobj, i64 1
```

This will result in redundant instrumentation of TSan, resulting in
greater performance costs, especially when there are loops, referring to
this [godbolt page](https://godbolt.org/z/eaT1fPjTW) for details.
```cpp
char loop(int x) {
    char buf[10];
    char *p = buf;
    for (int i = 0; i < x && i < 10; i++) {
      // Should not instrument, as its base object is a non-captured stack
      // variable.
      // However, currectly, it is instrumented due to %p = %phi ...
      *p++ = i;
    }

    // Use buf to prevent it from being eliminated by optimization
    return buf[9];
}
```

There are TWO APIs `getUnderlyingObjectAggressive` and
`findAllocaForValue` that can backtrack the pointer via tree traversal,
supporting phis/selects.

This patch replaces `getUnderlyingObject` with `findAllocaForValue`
which:
1. Properly tracks through PHINodes and select operations
2. Directly identifies if a pointer comes from a `AllocaInst`

Performance impact:
- Compilation: Moderate cost increase due to wider value tracing, but...
- Runtime: Significant wins for code with pointer induction variables
derived from stack allocas, especially for loop-heavy code, as
instrumentation can now be safely omitted.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list