[PATCH] D60593: [GwpAsan] Introduce GWP-ASan.

Mon Apr 15 16:04:07 PDT 2019

hctim added inline comments.

================
Comment at: compiler-rt/lib/gwp_asan/allocation_metadata.h:21
+  struct Trace {
+    uintptr_t Trace[MaximumStackFrames];
+    bool Collected = false; // Whether this trace was recorded.
----------------
vlad.tsyrklevich wrote:
> Since this is not currently used I'd just remove this (it's not likely to be usable anyways as we don't include a length.)
Since the change to use Trace[0]==0 to indicate missing trace/end of trace, I've kept it around, but reduced the length to 1.

================
Comment at: compiler-rt/lib/gwp_asan/allocation_metadata.h:41
+  // Whether this allocation has been deallocated yet.
+  bool IsDeallocated = false;
+};
----------------
eugenis wrote:
> This feels wasteful. The bools could be kept together to save on alignment padding. The bool in "struct Trace" seems unnecessary - could we use Trace[0] == 0 to indicate a missing trace?
Done. We can also pack this struct if need be, but I don't think we want the performance hit to save the alignment bytes (the structure should be relatively well aligned anyway).

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:23
+namespace gwp_asan {
+GuardedPoolAllocator::~GuardedPoolAllocator() {
+  if (GuardedPagePool) {
----------------
morehouse wrote:
> This is going to cause racy use-after-dtors on thread exit.
> 
> We need a trivial dtor if we want a global `GuardedPoolAllocator`, or we need to avoid destruction.
Have removed the nontrivial d'tor and explained that the class leaks if destructed during the lifetime of a program.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:52
+  Metadata = reinterpret_cast<AllocationMetadata *>(
+      malloc(NumGuardedSlots * sizeof(AllocationMetadata)));
+
----------------
eugenis wrote:
> hctim wrote:
> > cryptoad wrote:
> > > `GuardedPoolAllocator::init` is called within the constructor of `Allocator` for Scudo, so `malloc` here doesn't sound like it would work?
> > > Any heap related function should probably be avoided.
> > So there's a neat little trick we're using here. `SampleRate` is initialised intentionally at the end of `init()`, so that any call to `shouldSample()` returns false. This means that the `malloc()` calls do not fall into the guarded pool before the `GuardedPoolAllocator` is completely initialised. This trick also allows us to have GWP-ASan disabled at runtime without creating the allocation pool (and thus reducing static memory overhead).
> Could full initialization be done lazily at the first sampled malloc?
> At least, we should not allocate metadata array if the process is going to have sample rate of 0.
We don't allocate if it's disabled (see the early return in `init()` on line 40. What benefits do we get from lazy-init?

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:178
+
+void GuardedPoolAllocator::applyAllocationRightAlignment(
+    std::size_t *Size, uintptr_t *AllocationPtr) {
----------------
morehouse wrote:
> Why do we need so many options?  In general I think we need to enforce some minimal alignment, probably `alignof(std::max_align_t)`.  For smaller allocs than that, we might be able to do smaller alignments.  So maybe keep `POWER_OF_TWO` case and remove all other options.
Removed a whole bunch of logic here.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:321
+  } else {
+    Meta = addrToMetadata(AccessPtr);
+  }
----------------
vlad.tsyrklevich wrote:
> hctim wrote:
> > vlad.tsyrklevich wrote:
> > > The rest of this routine assumes Meta is valid but it may not be (e.g. a wild access to a page that's never been allocated yet.)
> > The invariant here is that the only time when `Meta` is invalid is with an `UNKNOWN` error. I've made the switch for `UNKNOWN` exit instead of fallthrough and added an `assert()` to clarify this invariant below.
> > 
> > A wild access should set `Meta` but keep `ErrorType == UNKNOWN`. Hopefully this helps.
> I should have mentioned the code path that got me here. It's possible to hit this with an invalid Meta for INVALID_FREE right now.
Gotcha. I've fixed this up, as we have two different edge cases with `INVALID_FREE`. If the `INVALID_FREE` lands on a slot that's been allocated or a guard page, we dump the information. Otherwise, we tell the user about the invalid free and stop processing there to avoid using an invalid meta.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:69
+
+  // Set the next sample counter. We should be done with all internal
+  // allocations at this point. If we were to not do this, the first allocation
----------------
vlad.tsyrklevich wrote:
> Is this true? It shouldn't be based on my reading of the sampling logic. I don't think this needs to be set here nor should it be as it's only for this current thread. I'm particularly careful abut this as we really want to avoid always sampling the first allocation on a given thread. I had this bug originally in Chrome thinking it didn't matter but it turns out the first allocation would be for something that had thread-lifetime and it led to very quick allocator depletion!
Yep - thanks!

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:84
+  // details.
+  SampleRate = Opts.SampleRate - 1;
+  IsInitialised = true;
----------------
vlad.tsyrklevich wrote:
> Wait, why do we subtract 1 here and then always add 1 later?
`NextSampleCounter` in `shouldSample()` calculates `PRNG % (SampleRate + 1)` as `SampleRate == 0` on thread-init, and would cause an arithmetic error if `malloc()` is called befire `init()` completes. If we can enfore that the allocator doesn't call `malloc()` before `init()` finishes, then we can remove this.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:43
+  // functions before init() has been called.
+  constexpr GuardedPoolAllocator() {}
+  GuardedPoolAllocator(const GuardedPoolAllocator &) = delete;
----------------
morehouse wrote:
> We should have a static_assert to ensure a constexpr GuardedPoolAllocator can actually be instantiated.
This is also vestigal, we don't actually need a constexpr constructor. Zero-initialisation during dynamic loading is sufficient for us to not recursively `malloc()`,  as `shouldSample()` and `pointerIsMine()` are designed to guard entry with zero-initialised members.

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:55
+    if (UNLIKELY(SampleRate == 0))
+      return false;
+
----------------
vlad.tsyrklevich wrote:
> morehouse wrote:
> > morehouse wrote:
> > > hctim wrote:
> > > > eugenis wrote:
> > > > > morehouse wrote:
> > > > > > It would be nice to avoid this check on the fast path.
> > > > > > 
> > > > > > Perhaps we can do `rand() % (SampleRate + 1)` and then check `SampleRate !=0` only just before we would return true.
> > > > > Or it could be another special value of NextSampleCounter, which could also serve as a way to temporarily disable sampling in a thread.
> > > > > 
> > > > Yes and no. We would have to set `SampleRate = Opts.SampleRate - 1` to get precise sampling behaviour, which would make a sample rate of always on ("1") be the same as uninitialised.
> > > > 
> > > > I've added another flag `IsInitialised` in order to counteract this.
> > > This doesn't seem any better.  We're still checking for initialization on the fast path.
> > Sorry, disregard the last comment; it's actually not on the fast path.  This looks like it will have better perf.
> Is it not still on the fast path? I thought we'd want the if (IsInitialized) check to be in if (UNLIKELY(counter ==0)) to get it out of there?
Tying to the discussion in `scudo_allocator:236`, I think that we should scrap calling `malloc()` from within GwpAsan's init. This would allow us to not have an initialisation check entirely, with the strict requirement that `GuardedPoolAllocator::init()` is called before any malloc takes place. WDYT?

================
Comment at: compiler-rt/lib/gwp_asan/guarded_pool_allocator.h:190
+  // The number of guarded slots that this pool holds.
+  std::size_t NumGuardedSlots = 0;
+  // Record the number of currently used slots. We store this amount so that we
----------------
eugenis wrote:
> What's the distinction between slots and pages (as in FreePages below)?
A slot can be multiple-pages in this implementation - but yes some the member names were vestigal to the previous implementation. Have updated this.

================
Comment at: compiler-rt/lib/gwp_asan/mutex.h:34
+
+  std::atomic<bool> Semaphore{false};
+};
----------------
pcc wrote:
> hctim wrote:
> > cryptoad wrote:
> > > Does using `std::atomic` bring in a libc++ (or equivalent) dependency?
> > It brings in a libc++ dependency when building GWP-ASan, but no further than that (i.e. does not affect Scudo). `std::atomic<bool>` is header-only, and although we don't have any strong guarantess for this, I believe it to be a safe bet.
> > 
> > @pcc - can you confirm that this is a safe bet?
> I don't believe that the "header-only" subset of the standard library is formally defined, so in theory there could be a library dependency here but it seems unlikely. If we want a guarantee of no library dependency, we could always just use the `__atomic` builtins here like scudo is doing.
Moved onto `__atomic` builtins instead.

================
Comment at: compiler-rt/lib/gwp_asan/optional/runtime_env_flag_parser.h:19
+// Parse the options from the GWP_ASAN_FLAGS environment variable.
+void initOptions();
+// Returns a pointer to the initialised options. Call initFlags() prior to
----------------
morehouse wrote:
> Can we outsource all flag parsing and just have GuardedPoolAllocator configured during init?
We do. Sorry if this isn't clear in the design.

The `optional` directory (and build flags) basically are a set of features that an allocator can choose to use if they don't want to reimplement the functionality. In this case, the optional component is the runtime environment flag parser, which has a dependency on sanitizer_common.

When Scudo merges its own flag parsing, they can then parse GwpAsan flags by their own methodology, as the GwpAsan options are passed to the allocator in `init()`. The configuration of GwpAsan is orthogonal to how the options are parsed.

================
Comment at: compiler-rt/lib/scudo/scudo_allocator.cpp:243
+    : Quarantine(LINKER_INITIALIZED) {
+      gwp_asan::options::initOptions();
+      GuardedAlloc.init(*gwp_asan::options::getOptions());
----------------
eugenis wrote:
> hctim wrote:
> > cryptoad wrote:
> > > We are avoiding any non-trivial initialization on static objects. Ideally those 2 would go in `init`.
> > So this kind of conflicts with my previous statement. If we move this initialisation into `init()`, we will recurse into `init() -> Allocator::allocate() -> initThreadMaybe() -> init()` and we deadlock on `pthread_once`.
> > 
> > We have two options here:
> > 1. Leave the GwpAsan initialisation in the c'tor, which forces the initialisation of Scudo at startup, rather than lazily initialising.
> > 2. We avoid malloc() and instead do our own memory management in GwpAsan's `init()`. 
> > 
> > @eugenis - What do you think is the right long-term support option for GwpAsan here?
> how about
> 3. Do "fast pass" initialization in init(), then do full initialization on first sampled malloc
> 
> "fast pass" initialization is just enough to do sampling, i.e. the inlined part of shouldSample().
> 
> Since init() can run in a secondary thread, the access to SampleRate can race and needs to be changed to an atomic.
So this would be a problem with lazy-init of GWP-ASan. For lazy-initialisation, we would require  `NextSampleCounter`, `SampleRate`, `GuardedPagePool`, and `GuardedPagePoolEnd` to all be atomic (and at least one with . At the moment, `init()` is run before any allocations take place, so it shouldn't race. Changing these to atomics would incur a significant performace overhead to the allocator's fast path.

I'm leaning towards #2 here, and requiring the c'tor to initialise GwpAsan before any allocations take place.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60593/new/

https://reviews.llvm.org/D60593