[PATCH] D19921: [esan] EfficiencySanitizer shadow memory

Thu May 5 01:31:32 PDT 2016

filcab added a comment.

> Making it #if LINUX should be sufficient. This matches what tsan does.

It's still very Linux-specific, and we don't need to over-specialize the sanitizer when stopping at POSIX is more than enough.
TSan does this and look at the mess of `#if` it has for things that are called only once. Especially bad since many of them are not performance-critical (e.g: `SIG*` definitions), and we end up making a mess, but gaining next to nothing.

Sometimes we end up with `#if SANITIZER_LINUX` (or another OS) inside a function because it's performance-sensitive, or extracting all the information needed for the Linux-specific code would be a big refactoring and not worth it. This is not one of those cases.

> ... it is not much more complex than other mappings: it simply varies the offset...

That is more complex, though. Both me and Mike asked about this, so it seems like a better explanation would be nice.
I am not seeing a reason for the different offsets. I understand it's all a simple transformation to get each offset, but the "why" still eludes me (can't talk for Mike, though).

================
Comment at: lib/esan/esan.cpp:72
@@ +71,3 @@
+      DCHECK(isAppMem(AppStart));
+      DCHECK(!isAppMem(AppStart - 1));
+      DCHECK(isAppMem(AppEnd - 1));
----------------
Not really. ASan has one special case where there's more than a shadow region.
All the others have one shadow region (divided into low shadow, high shadow, and shadow gap).

You mention that "Any direct-map shadow approach has to use different shadow regions for the different 64-bit app regions", what's missing is the "why?".

I might be missing something, but like I said, I don't see a reason to have all this complexity.
Having a single, big, shadow region would also help with not needing to mask addresses in the memToShadow transformation. This also avoids having problems in the `appToShadow(appToShadow(x))` transformation, as that hits the shadow gap, which has no read nor write permissions.

================
Comment at: lib/esan/esan.cpp:74
@@ +73,3 @@
+      DCHECK(isAppMem(AppEnd - 1));
+      DCHECK(!isAppMem(AppEnd));
+      DCHECK(!isShadowMem(AppStart));
----------------
Right, you have white-listed some app memory, and we end up with `appToShadow(AppEnd)` not falling into shadow memory because there are blanks in the address space that aren't "app" nor "shadow", is that so?

================
Comment at: lib/esan/esan_interceptors.cpp:427
@@ +426,3 @@
+
+static bool fix_mmap_addr(void **addr, SIZE_T sz, int flags) {
+  if (*addr) {
----------------
`fixMmapAddr`

================
Comment at: lib/esan/esan_interceptors.cpp:438
@@ +437,3 @@
+    if (!SingleApp) {
+      VPrintf(1, "mmap conflict: %p-%p is not in an app region\n",
+              *addr, (uptr)*addr + sz);
----------------
`[%p, %p)`

================
Comment at: lib/esan/esan_shadow.h:18
@@ +17,3 @@
+
+#if !defined(__LP64__) && !defined(_WIN64)
+#error "Only 64-bit is supported"
----------------
`#include <sanitizer_common/sanitizer_platform.h>
#if SANITIZER_WORDSIZE !=64
#error ...
`

================
Comment at: lib/esan/esan_shadow.h:24
@@ +23,3 @@
+
+#if defined(__x86_64__)
+// Linux/FreeBSD x86_64
----------------
`#if SANITIZER_LINUX`

================
Comment at: lib/esan/esan_shadow.h:25
@@ +24,3 @@
+#if defined(__x86_64__)
+// Linux/FreeBSD x86_64
+//
----------------
`Linux`. We've already seen differences in mapping for Linux and FreeBSD, so I wouldn't lump them together unless an actual port is in the way and the same mapping works.

================
Comment at: lib/esan/esan_shadow.h:48
@@ +47,3 @@
+// We use Umbra's scheme as it was designed to support different
+// offsets, it supports two different shadow mappings (which we may want to
+// use for future tools), and it ensures that the shadow of a shadow will
----------------
Nothing in this patch mentions "why" supporting different offsets is a priority. Is it not possible to do just the one offset (with possibly varying scales) on Linux due to virtual address space fragmentation?
Are there performance concerns which we can address when we scale down the shadow, but can't when the shadow is very large?

I really want to avoid this code being more complex than it needs to. Especially since it will likely change among different OS.

================
Comment at: lib/esan/esan_shadow.h:78
@@ +77,3 @@
+// shadow(shadow) ends up disjoint from shadow(app):
+//
+// [0x00001400'00000000, 0x00001500'00000000)
----------------
`Mapping` is more ambiguous, and it's a top-level (in ESan) name.
`ShadowMapping` is explicitly for a shadow memory mapping. We only have the one class and is easily reachable everywhere in ESan. It should be more explicit.

http://reviews.llvm.org/D19921