[PATCH] D60593: [GwpAsan] Introduce GWP-ASan.

Fri Apr 26 15:33:31 PDT 2019

hctim added a comment.

Allocator fast path CFG (@morehouse, @vlad.tsyrklevich) :
F8777483: Screenshot from 2019-04-26 14-01-52.png <https://reviews.llvm.org/F8777483>

================
Comment at: compiler-rt/CMakeLists.txt:281
+option(GWP_ASAN_SCUDO_HOOKS
+  "When building scudo, should we install GWP-ASan hooks (default=False)?" OFF)
+pythonize_bool(GWP_ASAN_SCUDO_HOOKS)
----------------
morehouse wrote:
> I think the "(default=False)" is unnecessary since the default value is clearly "OFF".
> 
> Also, shouldn't this go in the Scudo cmake?
Done. Also turned this default to True for supported platforms.

Needs to be defined this high in the build as it's used by `compiler-rt/test/lit.common.configured`.

================
Comment at: compiler-rt/cmake/config-ix.cmake:675
+# TLS allocations on first-use.
+if (COMPILER_RT_HAS_SANITIZER_COMMON AND GWP_ASAN_SUPPORTED_ARCH AND
+    OS_NAME MATCHES "Android|Linux")
----------------
morehouse wrote:
> Does this mean GWP-ASan requires sanitizer_common to be present?
Thanks, fixed. Only for sanitizer-common backed default initialisation of `gwp_asan::Options` (which is not required if the implementing allocator creates the options themselves).

================
Comment at: compiler-rt/lib/gwp_asan/CMakeLists.txt:60
+  # include the sanitizer_common flag parsing object lib
+  # 'RTSanitizerCommonNoTermination'.
+  add_compiler_rt_object_libraries(RTGwpAsanFlagParser
----------------
morehouse wrote:
> What does RTSanitizerCommonNoTermination do?
Provides argument parsing and printf from sanitizer_common (and also will provide unwinding in the future). This is only for allocators to use if they don't want to populate the Options struct themselves, and are okay with depending on sanitizer_common.

================
Comment at: compiler-rt/lib/gwp_asan/allocation_metadata.h:39
+  // Whether this allocation has been deallocated yet.
+  bool IsDeallocated = false;
+};
----------------
morehouse wrote:
> Probably don't need this.  Just set `DeallocationTrace[0] = 0` on alloc.
At the moment, `DeallocationTrace[0] = 0` identifies that no trace was collected. I think we still need this for the moment.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:182
+
+  bool ShouldRightAlign = (*Size % 2 == 1);
+
----------------
vlad.tsyrklevich wrote:
> hctim wrote:
> > vlad.tsyrklevich wrote:
> > > In Chrome we left- or right-align based on whether the (randomly chosen) slot is even/odd, that means an allocation will test for both under- and over-flows across different runs. However in this case we always left-/right-align based on the size which means a given type will only ever be tested one way (and I'd expect we have a lot more even allocations than odd allocations.)
> > Gotcha. In this case, we need a runtime option (like the previous `AlignmentStrategy`) to guarantee left && right alignment for testing, as we don't want flaky tests.
> > 
> > How would you prefer to go about this? Two runtime options, `NoRandomAlignment` and `AlignRight`?
> Why not just have the tests either 1) Use full-page allocations so that we know exactly where the bounds are or 2) a helper routine to get a left-/right-aligned allocation? Those are the two things I end up using in Chrome and they've been sufficient.
FYI - I have adjusted this to be purely PRNG based as if `NumGuardedSlots` is odd, it will give us a higher probability of left-alignment. If `NumGuardedSlots == 1`, we want to still do random left/right alignment.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:62
+  else
+    __atomic_store_n(&AdjustedSampleRate, 1, __ATOMIC_RELAXED);
+
----------------
morehouse wrote:
> What if `Opts.SampleRate == 0`?
Shouldn't be allowed - the default flag parser has the check:
```
if (o->Enabled && o->SampleRate < 1) {
    __sanitizer::Printf("GWP-ASan ERROR: SampleRate must be >= 1 when "
                        "GWP-ASan is enabled.\n");
    exit(EXIT_FAILURE);
  }
```

... but allocators can implement their own flag parser. I've added an explicit check in `init()` to consider `SampleRate == 0` as being the same as `Enabled == false`.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:76
+                   __ATOMIC_RELAXED);
+  __atomic_store_n(&IsInitialised, true, __ATOMIC_RELAXED);
+}
----------------
morehouse wrote:
> I think we need to hold `PoolMutex` during `init`.  Otherwise we have no guarantees that everything else is initialized once `IsInitialized = true` (instructions may be executed out-of-order).
Fixed by thread-hostile requirements described above.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:90
+  uintptr_t Ptr = slotToAddr(Index);
+  markReadWrite(reinterpret_cast<void *>(Ptr), maximumAllocationSize());
+
----------------
morehouse wrote:
> Suppose `Size <= PageSize < maximumAllocationSize()`.  In this case, we can `mprotect` a single page rather than the whole slot to improve bug-detection.
That's an awesome catch! Thanks!

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:155
+uintptr_t GuardedPoolAllocator::getPageAddr(uintptr_t Ptr) const {
+  return Ptr & ~(static_cast<uintptr_t>(PageSize) - 1);
+}
----------------
morehouse wrote:
> This rounds down to page size, not slot size.  So we will have unintended behavior when `AllocSize < PageSize < SlotSize` and the alloc is right-aligned.
Yep, this is intended, but one of the `deallocate()` was using this incorrectly (where they wanted slot address instead of page address). I've updated the callee.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:168
+  if (NumUsedSlots < NumGuardedSlots) {
+    ReservedIndex = NumUsedSlots++;
+  } else {
----------------
vlad.tsyrklevich wrote:
> This algorithm doesn't work, imagine NumGuardedSlots=4. Then you first start off with 0 1 2 3 + (The plus donates FreeSlotsLength.) Then after an allocation you have 3 1 2 + 3, after another allocation you have 3 2 + 2 3. Now your ReservedIndex reads past the end of FreeSlotsLength which is wrong. If you free the first pointer you get 3 2 0 + 3. Another allocation 3 2 + 0 3, then 3 + 2 0 2. Now you've hit NumUsedSlots == NumGuardedSlots and you'll allocate slot 3 (again.)
> 
> This highlights the importance of tests like AllocDeallocAllPages, ThreadedAllocCount, and ThreadedHighContention in https://github.com/chromium/chromium/blob/master/components/gwp_asan/client/guarded_page_allocator_unittest.cc
> 
> We need to add tests ensuring that in both single- and multi-threaded contexts we are never returning slots twice.
Furthermore, this also had an issue where (if it worked correctly in the first place), it lead to more deterministic left/right alignment, as the alignment choice was made based on the slot number.

I've fixed this and will add some more tests before submitting. Leaving this comment open to remind me to do this at a later date, but focus right now is getting this out for another round of comments.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:206
+  if (SingletonPtr == nullptr)
+    exit(EXIT_FAILURE);
+  SingletonPtr->reportErrorAndDieInternal(AccessPtr, Error);
----------------
morehouse wrote:
> We should have an error message so the user knows what happened.
At this point, we don't have access to a `printf()` library. 

Also, I'm not sure whether this is actually reachable. SingletonPtr is always initialised in this TU before the signal handlers are installed. Have changed this to an assert.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:262
+  case GuardedPoolAllocator::GwpAsanError::UNKNOWN:
+    (*Printf)(
+        "The cause of the error is indeterminate. GWP-ASan couldn't "
----------------
morehouse wrote:
> Nit:  I think `Printf(...)` suffices.
I didn't know this, thanks!

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:298
+  (*Printf)("0x%zx is located ", AccessPtr);
+  if (AccessPtr < Meta->Addr)
+    (*Printf)("%zu bytes to the left ", Meta->Addr - AccessPtr);
----------------
vlad.tsyrklevich wrote:
> This will print "0 bytes to the right" on UAF at the address of the allocation. Perhaps elide the left/right calculation and just print the info for the allocation.
Could this be useful for UaF where there is a nonzero offset? Have elided for zero offsets only, as I figure it's better to keep the information when it may be of use.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:338
+  if (!pointerIsMine(reinterpret_cast<void *>(AccessPtr))) {
+    Error = GwpAsanError::UNKNOWN;
+  } else if (Error == GwpAsanError::UNKNOWN) {
----------------
morehouse wrote:
> Hmm... So any segfault will trigger a GWP-ASan report, even if its very far from the guarded region?
Have changed to simply echo "Segmentation fault" and exit.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:74
+    uintptr_t P = reinterpret_cast<uintptr_t>(Ptr);
+    return GetGuardedPagePool() <= P && P < GetGuardedPagePoolEnd();
+  }
----------------
morehouse wrote:
> vlad.tsyrklevich wrote:
> > hctim wrote:
> > > vlad.tsyrklevich wrote:
> > > > We've given GuardedPagePool/GuardedPagePoolEnd the atomic treatment but since we access them sequentially it's possible that we get the sequence:
> > > > T1 Load GuardedPagePool
> > > > T2 Lazy Init
> > > > T1 Load GuardedPagePoolEnd
> > > > 
> > > > and give a bad value if the pointer is between 0 and GuardedPagePoolEnd. It seems like these values should not be atomic but that instead we are forced to perform an IsInitialised check first instead.
> > > Alternatively, we could set `GuardedPagePool = UINTPTR_MAX` in the constructor for value-initialisation, and not have the overhead of the check. Do you see any problems with this?
> > I'm not familiar with the atomic orderings but it seems like technically that's still not correct because __ATOMIC_RELAXED doesn't guarantee ordering (though I suppose loading IsInitialized here wouldn't fix that either.) We might need to change the memory order of the writes (hopefully not the reads for optimality) and then we could make that scheme work. (Make sure to document it somewhere.)
> Perhaps we can require that GPA has been initialized prior to calling `pointerIsMine`.
> 
> Scudo has some code like
> ```
> allocate() {
>   initScudoIfNecessary();
>   ...
> }
> ```
> 
> If we ensure that GPA is initialized inside `initScudo`, we guarantee that `PointerIsMine` is only ever called after `GPA::Init`.
So upon further reading, looks like the only way to ensure atomic-correctness here is to establish a dependency between GPP and GPPEnd, which require sequentially consistent loads and stores (which would hurt performance a trememdous amount).

We currently do what Matt suggests, so I've updated the documentation to reflect that the entire class is thread-hostile until `init()` is called.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:67
+    return NextSampleCounter-- == 1 &&
+           __atomic_load_n(&IsInitialised, __ATOMIC_RELAXED);
+  }
----------------
morehouse wrote:
> Can perf be improved by marking the `== 1` branch as `UNLIKELY`?
It wont on x64 as AFAIK the static predictors can't be primed (and doesn't appear to change the optimiser), but have as it may affect other other platforms.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:113
+  void markReadWrite(void *Ptr, std::size_t Size) const;
+  void markInaccessible(void *Ptr, std::size_t Size) const;
+
----------------
morehouse wrote:
> Can these be static?
They may access `GPA::Printf` on failure. Would you prefer them to be static w/ passing `Printf` as an argument?

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:162
+  // slots are available.
+  bool reserveSlot(std::size_t *PageIndex);
+
----------------
morehouse wrote:
> vlad.tsyrklevich wrote:
> > morehouse wrote:
> > > `ssize_t reserveSlot()` seems cleaner to me.  Also the function comment could be reworded clearer.
> > I think this is because it's modeled on my implementation since with my design I reserve slots/metadata indices separately.
> With `ssize_t` return value, we can still detect failure (-1) while not having to pass a return param.  What's the benefit of the current setup?
For future expansion, we will want to populate a slot index and a separate metadata index. I've reworded up the comment but left the implementation as-is for now. WDYT?

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:171
+  // slot in which the allocation should live.
+  void applyAllocationStrategy(std::size_t Size, uintptr_t *AllocationPtr);
+
----------------
morehouse wrote:
> Confusing interface and comment.  How about `uintptr_t AlignAllocation(size_t SlotIdx, size_t Size)`?
I had a good think about the best way to do this adjustment. Does the rework LGTY?

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:236
+  // should be sampled when it reaches zero.
+  static TLS_INITIAL_EXEC uint64_t NextSampleCounter;
+
----------------
morehouse wrote:
> Nit: static is unnecessary.
AFAICT it is required:
```
<snip>/llvm/compiler-rt/lib/scudo/../gwp_asan/guarded_pool_allocator.h:237:3: error: 'thread_local' is only allowed on variable declarations
  TLS_INITIAL_EXEC uint64_t NextSampleCounter;
```

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator_posix.cpp:71
+
+static void sigSegvHandler(int sig, siginfo_t *info, void *ucontext) {
+  if (sig != SIGSEGV)
----------------
morehouse wrote:
> This should probably forward to the previous SEGV handler after printing any GWP-ASan related things.
> 
> Doing this correctly is tricky, but you can look at how the original Google3 implementation did it for reference.
Can you PTAL and see if it LGTY?

================
Comment at: compiler-rt/lib/gwp_asan/options.inc:24
+GWP_ASAN_OPTION(
+    int, NumUsableGuardedSlots, 16,
+    "Number of usable guarded slots in the allocation pool. Defaults to 16.")
----------------
vlad.tsyrklevich wrote:
> The documentation and tests use NumUsableGuardedPages but it's not directly tested anywhere so it's not caught that this is named differently here and doesn't work.
Added a test for this.

================
Comment at: compiler-rt/lib/scudo/scudo_allocator.cpp:305
+#ifdef GWP_ASAN_HOOKS
+    if (UNLIKELY(GuardedAlloc.shouldSample())) {
+      if (void *Ptr = GuardedAlloc.allocate(Size))
----------------
morehouse wrote:
> What does the assembly for this look like on the fast path?
See attached image in description.

================
Comment at: compiler-rt/test/gwp_asan/alignment_power_of_two.cpp:18
+  for (const auto& P : AllocSizeToAlignment) {
+    char *Ptr = reinterpret_cast<char *>(malloc(P.first));
+    if (reinterpret_cast<uintptr_t>(Ptr) % P.second != 0) {
----------------
vlad.tsyrklevich wrote:
> Do we need to ensure this allocation is right-aligned? Or do both left-/right-?
Updated w/ comment in the test. We have no direct influence on left/right alignment any more, so we just run it multiple times.

================
Comment at: llvm/docs/GWPASan.rst:114
+Please note that the use-after-free detection for a guarded allocation is only
+emphemeral. We reuse inaccessible slots on-demand as we have a fixed number of
+them, and wish to avoid starving the allocation pool to solely catch
----------------
vlad.tsyrklevich wrote:
> nit: ephemeral
I think "transient" actually suits this better.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60593/new/

https://reviews.llvm.org/D60593