[PATCH] D60593: [GwpAsan] Introduce GWP-ASan.

Vlad Tsyrklevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 12 16:00:16 PDT 2019


vlad.tsyrklevich added a comment.

In D60593#1464932 <https://reviews.llvm.org/D60593#1464932>, @hctim wrote:

> I've implemented them here because it gives us a few benefits:
>
> 1. Allows us to test different configuration options at runtime without a recompile.
> 2. Gives allocators more choice as to how they use GWP-ASan (they might provide alignment guarantees that we can't).
> 3. We can A/B test deployments in production. We can have half a fleet using `OddLeftEvenRight` and half a fleet using `OddRightEvenLeft`, and this may hypothetically give us the best chance of finding bugs.
>
>   Happy to take feedback on this, if others see it as unneccessary we can delete them.


I can't imagine when #2 would ever occur, so I'd just ignore that concern and reasonably align allocations for performance by default. For #3 I also think it's not likely to usefully yield more bugs or find much use. Having a reasonable default and less complexity is preferable. Later on when we have a way to test the allocator we can see if we want to keep a 'perfectly right-align' bit that seems reasonable to me. Some MS folks mentioned they have an internal PageHeap change that does this and it finds a lot more overflows but the perf impact is noticable. (The value of those overflows is reduced though since those off-by-small-n writes are less likely to ever reach another allocation.)



================
Comment at: compiler-rt/lib/gwp_asan/allocation_metadata.h:20
+
+  struct Trace {
+    uintptr_t Trace[MaximumStackFrames];
----------------
Rename to AllocationInfo or CallSiteInfo? This already stores TID, and could potentially also be used for time. (Also in https://chromium-review.googlesource.com/c/chromium/src/+/1566455 I even move some of the stack trace out of my equivalent version of this struct.)


================
Comment at: compiler-rt/lib/gwp_asan/allocation_metadata.h:21
+  struct Trace {
+    uintptr_t Trace[MaximumStackFrames];
+    bool Collected = false; // Whether this trace was recorded.
----------------
Since this is not currently used I'd just remove this (it's not likely to be usable anyways as we don't include a length.)


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:69
+
+  // Set the next sample counter. We should be done with all internal
+  // allocations at this point. If we were to not do this, the first allocation
----------------
Is this true? It shouldn't be based on my reading of the sampling logic. I don't think this needs to be set here nor should it be as it's only for this current thread. I'm particularly careful abut this as we really want to avoid always sampling the first allocation on a given thread. I had this bug originally in Chrome thinking it didn't matter but it turns out the first allocation would be for something that had thread-lifetime and it led to very quick allocator depletion!


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:84
+  // details.
+  SampleRate = Opts.SampleRate - 1;
+  IsInitialised = true;
----------------
Wait, why do we subtract 1 here and then always add 1 later?


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:302
+
+      if (AccessPtr % PageSize <= PageSize / 2) {
+        // Most likely that this was a heap-buffer-overflow. Check the LHS
----------------
I don't like this logic. If the allocation is size 4095 and left-aligned and overflows by two bytes it will be classified as an underflow to the next page. I'd use the same logic as Matt and I where we decide based on just whether it's left/right of the half way point of the guard page. (Also Left/RightMeta might not even be for valid allocations here.)


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:321
+  } else {
+    Meta = addrToMetadata(AccessPtr);
+  }
----------------
hctim wrote:
> vlad.tsyrklevich wrote:
> > The rest of this routine assumes Meta is valid but it may not be (e.g. a wild access to a page that's never been allocated yet.)
> The invariant here is that the only time when `Meta` is invalid is with an `UNKNOWN` error. I've made the switch for `UNKNOWN` exit instead of fallthrough and added an `assert()` to clarify this invariant below.
> 
> A wild access should set `Meta` but keep `ErrorType == UNKNOWN`. Hopefully this helps.
I should have mentioned the code path that got me here. It's possible to hit this with an invalid Meta for INVALID_FREE right now.


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:55
+    if (UNLIKELY(SampleRate == 0))
+      return false;
+
----------------
morehouse wrote:
> morehouse wrote:
> > hctim wrote:
> > > eugenis wrote:
> > > > morehouse wrote:
> > > > > It would be nice to avoid this check on the fast path.
> > > > > 
> > > > > Perhaps we can do `rand() % (SampleRate + 1)` and then check `SampleRate !=0` only just before we would return true.
> > > > Or it could be another special value of NextSampleCounter, which could also serve as a way to temporarily disable sampling in a thread.
> > > > 
> > > Yes and no. We would have to set `SampleRate = Opts.SampleRate - 1` to get precise sampling behaviour, which would make a sample rate of always on ("1") be the same as uninitialised.
> > > 
> > > I've added another flag `IsInitialised` in order to counteract this.
> > This doesn't seem any better.  We're still checking for initialization on the fast path.
> Sorry, disregard the last comment; it's actually not on the fast path.  This looks like it will have better perf.
Is it not still on the fast path? I thought we'd want the if (IsInitialized) check to be in if (UNLIKELY(counter ==0)) to get it out of there?


================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:67
+    case 1:
+      NextSampleCounter = (Random::getRandomUnsigned64() % SampleRate) + 1;
+      return true;
----------------
hctim wrote:
> vlad.tsyrklevich wrote:
> > vlad.tsyrklevich wrote:
> > > vlad.tsyrklevich wrote:
> > > > Random::getRandomUnsigned64() % SampleRate means we are sampling more often than the actual sampling rate. This should be % (SampleRate * 2), or, even better, sample from a geometric distribution
> > > Woops, seems I made the same mistake in http://crrev.com/c/1400050
> > Actually I take that back. After reading it again that code uses a complicated sampling mechanism that does sample 1/N, the reason I chose it is so that we did not undersample the first SampleRate allocations as this scheme will. If it's possible to use the standard library I'd just replace it with a geometric distribution from the beginning if you can.
> Unfortunately I'm trying to stick to header-only c++stdlib, is it okay as currently implemented?
You should still modulo by SampleRate*2, perhaps store AdjustedSampleRate = (ConfigurationSampleRate*2) + 1 and use that instead of SampleRate directly?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60593/new/

https://reviews.llvm.org/D60593





More information about the llvm-commits mailing list