[PATCH] D19921: [esan] EfficiencySanitizer shadow memory

Wed May 4 10:58:10 PDT 2016

filcab added a subscriber: filcab.
filcab added a comment.

The OS-related stuff like fix_mmap_addr should be in a esan_posix.cc file, which can include external headers (like the ones in asan do).
That way you don't need to hard-code values or anything, you just include the required headers.

Can you explain a bit better why there's so much complexity around the mapping? I have some inline comments about it.

Thank you,

Filipe

================
Comment at: lib/esan/esan.cpp:68
@@ +67,3 @@
+
+  VPrintf(1, "Shadow scale=%d offs=%p\n", ShadowMapping.ShadowScale,
+          ShadowMapping.ShadowOffs);
----------------
Don't abbreviate "offset".

================
Comment at: lib/esan/esan.cpp:72
@@ +71,3 @@
+  uptr AppStart, AppEnd;
+  for (int i = 0; getAppRegion(i, &AppStart, &AppEnd); i++) {
+    uptr ShStart = appToShadow(AppStart);
----------------
Why loop over the regions and mapping them to different shadow regions? Why not do the same as ASan and use the system's virtual memory to your advantage, by simply mapping a huge region which covers everything you want, especially since your shadow is so much smaller?

================
Comment at: lib/esan/esan.cpp:73
@@ +72,3 @@
+  for (int i = 0; getAppRegion(i, &AppStart, &AppEnd); i++) {
+    uptr ShStart = appToShadow(AppStart);
+    uptr ShEnd = appToShadow(AppEnd - 1) + 1; // Do not pass end itself!
----------------
`ShadowStart`

================
Comment at: lib/esan/esan.cpp:74
@@ +73,3 @@
+    uptr ShStart = appToShadow(AppStart);
+    uptr ShEnd = appToShadow(AppEnd - 1) + 1; // Do not pass end itself!
+    VPrintf(1, "Shadow #%d: %zx-%zx (%zuGB)\n", i, ShStart, ShEnd,
----------------
Why?
Do we want the byte after the last shadow byte and this may not be the same as the shadow of the byte past the last byte of the region? When does that happen?

================
Comment at: lib/esan/esan.cpp:75
@@ +74,3 @@
+    uptr ShEnd = appToShadow(AppEnd - 1) + 1; // Do not pass end itself!
+    VPrintf(1, "Shadow #%d: %zx-%zx (%zuGB)\n", i, ShStart, ShEnd,
+            (ShEnd - ShStart) >> 30);
----------------
`[start,end)` is a better way to show half-open ranges (like what ASan does when it shows the mapping).

================
Comment at: lib/esan/esan_interceptors.cpp:26
@@ +25,3 @@
+// We define the few constants we need here:
+const int EINVAL = 22; // from /usr/include/asm-generic/errno-base.h
+const int MAP_FIXED = 0x10; // from /usr/include/sys/mman.h
----------------
No.
This should, at the very least, be under an `#if SANITIZER_LINUX`, since it's clearly Linux only.
I think we might end up not needing this, either.

================
Comment at: lib/esan/esan_interceptors.cpp:425
@@ +424,3 @@
+  if (*addr) {
+    if (!isAppMem((uptr)*addr) || !isAppMem((uptr)*addr + sz - 1)) {
+      VPrintf(1, "mmap conflict: %p-%p is not in an app region\n",
----------------
What if `addr+sz` gets to the next region?

================
Comment at: lib/esan/esan_shadow.h:47
@@ +46,3 @@
+//
+//   shadow(app) = ((app & 0x00000fff'ffffffff) + offs) >> scale
+//
----------------
Don't abbreviate "offset" (same for next line).

================
Comment at: lib/esan/esan_shadow.h:50
@@ +49,3 @@
+// Where offs for 1:1 is 0x00001200'00000000 and for others is tweaked
+// from a pure "<< scale" strategy:
+//   scale == 3: 0x0000900'000000000
----------------
What is that strategy, and why the tweaks?

================
Comment at: lib/esan/esan_shadow.h:54
@@ +53,3 @@
+//   scale == 1: 0x0000220'000000000
+//   scale == 0: 0x0000120'000000000
+//
----------------
Is this better than having a simple offset? I'm unclear on what this complexity is trying to accomplish.

================
Comment at: lib/esan/esan_shadow.h:77
@@ +76,3 @@
+// that need no data load.
+class Mapping {
+public:
----------------
`ShadowMapping`

================
Comment at: lib/esan/esan_shadow.h:86
@@ +85,3 @@
+  static const uptr App4Start  = 0xffffffffff600000u;
+  static const uptr App4End    = 0xffffffffff601000u;
+  static const uptr ShadowMask = 0x00000fffffffffffu;
----------------
An array of regions would be better than a bunch of constants.

================
Comment at: lib/esan/esan_shadow.h:90
@@ +89,3 @@
+  uptr ShadowScale;
+  uptr ShadowOffs;
+  void initMapping(uptr Scale) {
----------------
`Scale`
`Offset`
(We should just mention it's "shadow"-related on the type and be done with it)

================
Comment at: lib/esan/esan_shadow.h:91
@@ +90,3 @@
+  uptr ShadowOffs;
+  void initMapping(uptr Scale) {
+    static const uptr ShadowOffsArr[4] = {
----------------
`init(uptr ShadowScale)`

================
Comment at: lib/esan/esan_shadow.h:92
@@ +91,3 @@
+  void initMapping(uptr Scale) {
+    static const uptr ShadowOffsArr[4] = {
+        0x0000120000000000u,
----------------
`OffsetArray`

================
Comment at: lib/esan/esan_shadow.h:105
@@ +104,3 @@
+};
+extern Mapping ShadowMapping;
+#else
----------------
I would just call it `Mapping`, unless it could be confused with anything, in which case the current name is ok.

================
Comment at: lib/esan/esan_shadow.h:157
@@ +156,3 @@
+bool isShadowMem(uptr Mem) {
+// We assume this is only really used for debugging and so there's
+// no need to hardcode the mapping results.
----------------
Should we `#ifndef NDEBUG`, to make sure it's only there in debug versions?

http://reviews.llvm.org/D19921